1. Field of the Invention
The present invention relates to a data processing apparatus and a data processing method which divide information data, supplied together with a reference timing from a plurality of sources that are successively switched, into data of a predetermined processing unit and encode the divided data, and which are suitable for use, for example, in editing data to be recorded on a digital video disk (DVD).
2. Description of the Related Art
In the field of digital audio data compression, a subband coding (SBC) system, an adaptive transform coding (ATC) system and the like are generally employed in order to reduce the amount of data. Digital equipment that records audio data coded in this way on a recording medium, transmits it over a transmission line, or processes it is expected to become more common in the future. In these coding systems, digital audio data is divided into block units (audio frame units) and then coded.
In the above subband coding system, an input audio signal is divided into signals of plural frequency bands, and each of the divided signals is coded independently by exploiting the uneven distribution of signal power among the frequency bands. Specifically, dividing the input audio signal into plural subbands reduces the variation of signal energy within each subband, so that the dynamic range within each subband is reduced. Bits are allocated to each subband in accordance with its signal energy.
When the input audio signal is divided into the subband signals, the frequency band is repeatedly split in two by using a plurality of quadrature mirror filters (QMF), whereby tree-structured subbands are obtained. The signal samples in each resulting low-frequency band and high-frequency band are decimated to 1/2, so that the sampling frequency of each band is halved.
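The repeated two-way split with decimation can be illustrated by the following minimal sketch. It assumes the simplest possible filter pair (a 2-tap Haar pair) purely for illustration; practical coders use much longer filters, and the function names `qmf_split` and `tree_split` are hypothetical. The subbands are returned in tree order, which is not necessarily ascending frequency order.

```python
import math

def qmf_split(x):
    """Split x into a low band and a high band, each decimated by 2.

    Assumption: a 2-tap Haar filter pair stands in for a real QMF pair.
    """
    s = 1.0 / math.sqrt(2.0)
    low = [s * (x[i] + x[i + 1]) for i in range(0, len(x) - 1, 2)]
    high = [s * (x[i] - x[i + 1]) for i in range(0, len(x) - 1, 2)]
    return low, high

def tree_split(x, depth):
    """Apply the two-way split `depth` times, giving 2**depth subbands."""
    bands = [x]
    for _ in range(depth):
        next_bands = []
        for band in bands:
            low, high = qmf_split(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands

# 64 input samples split three times: 8 subbands of 8 samples each.
signal = [float(n % 8) for n in range(64)]
bands = tree_split(signal, 3)
print(len(bands), [len(b) for b in bands])
```

Each split halves the number of samples per band, so the total number of samples is unchanged while the sampling frequency of every band is halved at each level.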
In a transform coding system, an input audio signal is subjected to a linear transform that increases the concentration of signal power and is then quantized, whereby the coding efficiency is improved. A transform coding system that allocates bits adaptively is called, in particular, an adaptive transform coding system. A Fourier transform, a cosine transform or the like, for example, is employed as the above linear transform.
When the subband coding system or the adaptive transform coding system is employed, the overall coding quality can be further improved by weighting the signal in accordance with psycho-acoustic characteristics before quantization, so that deterioration of the signal in the bands that human beings can perceive is minimized.
Psycho-acoustic weighting is a method of successively calculating a temporal threshold value of audibility by using the absolute threshold of audibility and a relative threshold determined by the masking effect. Bit allocation is carried out based on this temporal threshold value.
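One way such threshold-driven bit allocation might look is sketched below. The text does not specify the allocation procedure, so everything here is an assumption made for illustration: the per-band threshold is taken as the larger of the absolute and masking thresholds, each added bit is assumed to lower quantization noise by about 6.02 dB, and bits are handed out greedily to the band whose noise most exceeds its threshold. The function name `allocate_bits` and all numeric levels are hypothetical.

```python
def allocate_bits(signal_db, abs_thr_db, mask_thr_db, total_bits, max_bits=15):
    """Greedy bit allocation from per-band thresholds (illustrative only)."""
    # Effective threshold per band: whichever threshold is higher.
    thr = [max(a, m) for a, m in zip(abs_thr_db, mask_thr_db)]
    # Signal-to-mask ratio per band, in dB.
    smr = [s - t for s, t in zip(signal_db, thr)]
    bits = [0] * len(smr)
    for _ in range(total_bits):
        # Noise-to-mask ratio after the bits allocated so far
        # (assumption: about 6.02 dB of noise reduction per bit).
        nmr = [
            smr[i] - 6.02 * bits[i] if bits[i] < max_bits else float("-inf")
            for i in range(len(smr))
        ]
        worst = max(range(len(nmr)), key=nmr.__getitem__)
        if nmr[worst] <= 0:
            break  # noise is already below threshold in every band
        bits[worst] += 1
    return bits

# Hypothetical levels in dB for four bands.
bits = allocate_bits(
    signal_db=[60, 40, 20, 5],
    abs_thr_db=[10, 10, 10, 10],
    mask_thr_db=[20, 30, 5, 0],
    total_bits=10,
)
print(bits)
```

In this sketch a band whose signal lies below its threshold receives no bits at all, which is the effect the weighting is intended to achieve.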
A typical audio data coding algorithm will be described by taking the algorithm of the MPEG/Audio standard as an example.
Initially, input PCM audio data linearly quantized with, for example, 16 bits is converted from the time domain into signals of 32 frequency bands. A masking level for masking the quantization error is calculated based on the psycho-acoustic characteristic in order to allocate bits for quantization.
The converted signals obtained as described above are quantized in accordance with the bit allocation based on the psycho-acoustic model and then coded, and are thereafter inserted into a frame together with data which a user can define arbitrarily.
When the coded data is decoded, the user-defined data is separated from it, the frame is unpacked into blocks, and the block data is decoded and inverse-quantized with reference to the supplied side information about the bit allocation. The inverse-quantized signal is then subjected to the inverse of the conversion used in coding, whereby the time-domain signal is restored.
In the above MPEG/Audio standard, three kinds of algorithms, i.e., layer I, layer II and layer III, are prescribed. The algorithm becomes more complicated in the order of layer I, layer II and layer III, and the sound quality improves in the same order. The sound quality also depends upon the bit rate used. For each of layers I to III, 14 bit rates are prescribed, ranging from 32 kb/s up to 448 kb/s, 384 kb/s and 320 kb/s, respectively; the target bit rate of each layer is, however, limited.
When, for example, the layer II algorithm is employed, audio data is processed in audio frame units (one audio frame includes 1152 samples in layer II), as shown in FIG. 1, and is thereby converted into coded audio data, i.e., a bit stream.
With the recent development of digital signal processing techniques, a digital video disk (DVD) capable of storing the data of one movie on an optical disk (120 mm in diameter) has been developed. This DVD video disk is manufactured by multiplexing coded video data, coded audio data and coded attached information such as superimposed dialogue, and recording the multiplexed data on one optical disk. Fabrication of a master disk of this DVD video disk is called authoring.
When audio data is encoded in an authoring system, if only one digital audio tape is loaded into the audio reproducing apparatus, processing the tape continuously from its head yields encoded audio frames that are continuous throughout.
If a plurality of digital audio tapes are loaded in turn into the audio reproducing apparatus, then when the loaded tape is exchanged for another one, the phase difference (offset) between the audio frames of the audio data reproduced from the previously loaded tape and the latest reference timing (video frame) cannot be obtained. Therefore, the audio frames of the audio data reproduced from the newly loaded tape cannot be made continuous with the audio frames of the previously loaded tape.
Specifically, when audio data is encoded, a plurality of audio samples are processed as one processing unit (an audio frame). Since the number of audio samples per audio frame is set to a value convenient for the coding calculation, the period of the audio frame inevitably bears no relation to the period of the processing unit of video data (the period of a video frame, i.e., a time code frame).
Therefore, if audio data is encoded without the audio frames and video frames being synchronized with each other, audio samples having the same time code value may be placed in different audio frames from one coding run to another (i.e., reproducibility may be lost).
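The mismatch between the two periods can be made concrete with a short calculation. The 1152-sample frame length comes from the layer II description above; the 48 kHz sampling rate and the 30000/1001 frames-per-second video rate are assumptions chosen only as a plausible example, and the function name is hypothetical.

```python
from fractions import Fraction

SAMPLES_PER_AUDIO_FRAME = 1152            # layer II frame length (from the text)
SAMPLE_RATE = 48000                       # assumption: 48 kHz sampling
VIDEO_FRAME_RATE = Fraction(30000, 1001)  # assumption: NTSC-style video rate

audio_period = Fraction(SAMPLES_PER_AUDIO_FRAME, SAMPLE_RATE)  # 24 ms
video_period = 1 / VIDEO_FRAME_RATE                            # about 33.37 ms

def audio_phase_at_video_frame(n):
    """Time elapsed since the last audio frame boundary at the start of
    video frame n, assuming both sequences start together at n = 0."""
    return (n * video_period) % audio_period

for n in range(4):
    print(n, float(audio_phase_at_video_frame(n)) * 1000.0, "ms")

# First video frame after 0 at which the two boundaries coincide again.
realign = next(n for n in range(1, 100000)
               if audio_phase_at_video_frame(n) == 0)
print("boundaries realign every", realign, "video frames")
```

Under these assumed rates the offset between an audio frame boundary and a video frame boundary changes at every video frame and returns to zero only after 720 video frames, so an encoder that starts at an arbitrary time code cannot rely on any fixed alignment.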
This lack of reproducibility will be described specifically with reference to FIG. 2. When the audio data of a tape TAPE1 and a tape TAPE2 are joined, a "margin" in which the same data is recorded on both tapes is generally provided. It is assumed that the audio data of the tape TAPE1 is switched to that of the tape TAPE2 at a start point P of a time code frame TN within the margin.
It is assumed that the audio frame including the sound data obtained when the tape TAPE1 is reproduced at the point P is an audio frame A1frame(N), and that the audio frame including the sound data obtained when the tape TAPE2 is reproduced at the same point P is an audio frame A2frame(N).
When the tape TAPE2 is reproduced, the phase difference between its audio frames and the time code frame generally differs from the phase difference obtained when the tape TAPE1 is reproduced. In the example shown in FIG. 2, the phase differences of the audio frames A1frame(N+1) and A2frame(N+1) with respect to the start point P are π1 and π2, respectively.
When the data streams obtained by encoding the audio data of the tapes TAPE1 and TAPE2 are joined so that the audio frame A1frame(N) is followed by the audio frame A2frame(N+1), in order to prevent the data at the start point P from being encoded twice, then, as shown in FIG. 2, two portions encoding the same audio data are nevertheless produced, and a time base shift of π1 − π2 arises after the joint point of the tapes TAPE1 and TAPE2, which prevents a DVD player from decoding correctly. An authoring system that carries out such editing therefore has a disadvantage.
Since π1 > π2 holds in the example shown in FIG. 2, the same data is encoded twice, which leads to a backward shift of the time base. If π1 < π2 holds, then some audio data is not encoded, which leads to a forward shift of the time base.
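The two cases can be summarized in a small sketch that takes the two phase differences of FIG. 2 (named `pi1` and `pi2` here) and reports the resulting shift. Expressing the phase differences in audio samples, and the function name `join_shift`, are assumptions made only for illustration.

```python
def join_shift(pi1, pi2):
    """Classify the result of joining A1frame(N) directly to A2frame(N+1).

    pi1, pi2: offsets of A1frame(N+1) and A2frame(N+1) from the point P,
    expressed here in audio samples (an assumed unit).
    Returns (shift, effect), where shift = pi1 - pi2.
    """
    shift = pi1 - pi2
    if shift > 0:
        effect = "duplicated"  # same data encoded twice: backward shift
    elif shift < 0:
        effect = "dropped"     # some data never encoded: forward shift
    else:
        effect = "aligned"     # frames happen to be continuous
    return shift, effect

print(join_shift(700, 300))  # the pi1 > pi2 case
print(join_shift(300, 700))  # the pi1 < pi2 case
```

Only when the two phase differences are equal does the joined stream remain continuous, which is not generally the case for independently reproduced tapes.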
In order to make the audio frames of different tapes continuous, two methods can be considered: a first method of storing a plurality of audio data in advance in some suitable storage means and thereafter editing them so that all the audio frames are continuous; and a second method of dividing the audio data at a silent portion, where data may be encoded twice or removed even if the audio frames of different tapes are not aligned.
The first method has the disadvantage of requiring a memory of large capacity. The second method is not practical, because there is no guarantee that silent portions occur regularly on a tape.