In the field of digital audio data compression, a subband coding (SBC) system, an adaptive transform coding (ATC) system and so on are generally employed in order to reduce the amount of data. Digital equipment that records audio data coded in this way on a recording medium, transmits it over a transmission line, or subjects it to signal processing is expected to become more common in the future. In these coding systems, digital audio data is divided into block units (audio frame units) and then coded.
In the above subband coding system, an input audio signal is divided into signals of plural frequency bands, and each of the divided signals is coded independently by utilizing the uneven distribution of electric power among the frequency bands. Specifically, dividing the input audio signal into signals of plural subbands reduces the variation of signal energy within each subband, thereby reducing the dynamic range. Bits are allocated to each subband in accordance with its signal energy.
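The idea of allocating bits in accordance with subband energy can be sketched as follows. This is only an illustrative greedy rule, not the allocation procedure of any particular standard: it assumes that each added bit reduces the quantization noise power of a subband by a factor of 4 (about 6 dB), and the function name `allocate_bits` and the example energies are hypothetical.

```python
def allocate_bits(subband_energies, total_bits):
    """Greedy allocation sketch: give each bit to the subband whose
    remaining quantization noise is currently the largest.

    Assumption: with 0 bits the noise equals the signal energy, and
    every extra bit divides the noise power by 4 (about 6 dB).
    """
    bits = [0] * len(subband_energies)
    noise = list(subband_energies)
    for _ in range(total_bits):
        worst = max(range(len(noise)), key=lambda i: noise[i])
        bits[worst] += 1
        noise[worst] /= 4.0
    return bits


# Hypothetical energies for four subbands, strongest first.
energies = [100.0, 10.0, 1.0, 0.1]
bits = allocate_bits(energies, 8)
```

With these example energies, the subbands carrying more energy receive more bits and the weakest subband receives none.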
When the input audio signal is divided into the signals of the subbands, the frequency band is repeatedly split in two by using a plurality of quadrature mirror filters (QMF), thereby yielding tree-structured subbands. The signal samples in each divided low frequency band and each divided high frequency band are decimated by a factor of 2, so that the sampling frequency of each band is halved.
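A minimal sketch of the repeated two-way split follows. It assumes the simplest possible two-tap filter pair (the Haar pair) in place of a practical QMF design, and the function names are hypothetical. Each split produces a low band and a high band with half as many samples, and the split is repeated on every band to form a tree.

```python
import math

_S = 1.0 / math.sqrt(2.0)


def qmf_split(x):
    """Split an even-length sequence into low and high bands,
    each decimated by 2 (Haar filter pair assumed for illustration)."""
    half = len(x) // 2
    low = [_S * (x[2 * i] + x[2 * i + 1]) for i in range(half)]
    high = [_S * (x[2 * i] - x[2 * i + 1]) for i in range(half)]
    return low, high


def qmf_merge(low, high):
    """Inverse of qmf_split: rebuild the full-rate sequence."""
    x = []
    for l, h in zip(low, high):
        x.append(_S * (l + h))
        x.append(_S * (l - h))
    return x


def tree_split(x, depth):
    """Repeat the two-way split on every band, giving 2**depth bands."""
    bands = [x]
    for _ in range(depth):
        bands = [half for band in bands for half in qmf_split(band)]
    return bands


signal = [1.0, 3.0, 2.0, 6.0, 5.0, 5.0, 4.0, 0.0]
low, high = qmf_split(signal)
bands = tree_split(signal, 2)
```

Here `low` and `high` each hold four samples for the eight input samples, and `qmf_merge(low, high)` restores the input up to rounding error.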
In a transform coding system, an input audio signal is subjected to a linear transform that concentrates its electric power into fewer coefficients and is then quantized, thereby improving the coding efficiency. Transform coding in which bits are allocated adaptively is specifically called the adaptive transform coding system. The Fourier transform, the cosine transform or the like, for example, is employed as the above linear transform.
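The concentration of power by a linear transform can be illustrated with a small sketch. It assumes an orthonormal DCT-II as the cosine transform and a hypothetical smooth eight-sample input; the function names are illustrative only.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II of a real sequence (direct O(n^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * acc)
    return out


def energy_in_largest(values, m):
    """Fraction of the total energy held by the m largest-magnitude values."""
    energies = sorted((v * v for v in values), reverse=True)
    return sum(energies[:m]) / sum(energies)


samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # illustrative smooth input
coeffs = dct_ii(samples)
fraction_after = energy_in_largest(coeffs, 2)    # transform domain
fraction_before = energy_in_largest(samples, 2)  # time domain
```

For this input the two largest transform coefficients hold more than 99% of the energy, whereas the two largest time-domain samples hold only a little over half, so fewer values need fine quantization after the transform.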
When the subband coding system or the adaptive transform coding system is employed, the overall coding quality can be improved further if the signal is weighted in accordance with a psycho-acoustic characteristic before being quantized, so that deterioration of the signal is minimized in the bands which a human being can perceive.
Psycho-acoustic weighting is a weighting method of sequentially calculating a temporal threshold value of the audible range by using an absolute threshold value thereof and a relative threshold value thereof determined by a masking effect. Bit allocation is carried out based on the above temporal threshold value.
A typical algorithm of an audio data coding system will be described below, taking the algorithm of the MPEG/Audio standard as an example.
Initially, input PCM audio data linearly quantized with 16 bits, for example, is converted from the time domain into signals of 32 frequency bands. A masking level for masking the quantization error is calculated based on the psycho-acoustic characteristic in order to allocate bits for quantization.
The converted signals obtained as described above are quantized in accordance with the bit allocation based on the psycho-acoustic model and then coded, and are thereafter inserted into a frame together with data which a user can arbitrarily define.
When the coded data is decoded, the data which a user can arbitrarily define is separated from it, the frame is deblocked, and the block data is decoded and inverse-quantized with reference to the supplied side information about the bit allocation. Then, the inverse-quantized signal is subjected to the inverse of the conversion used in the coding processing, thereby restoring the time-domain signal.
In the above MPEG/Audio standard, three kinds of algorithms, i.e., layer I, layer II and layer III, are prescribed. The algorithm becomes more complicated in the order of layer I, layer II and layer III, and the sound quality improves in the same order. The sound quality also depends upon the bit rate to be used. Fourteen bit rates are prescribed for each layer, ranging from 32 kb/s up to 448 kb/s for layer I, up to 384 kb/s for layer II and up to 320 kb/s for layer III; however, the bit rates which are the main objects (target bit rates) of the respective layers are limited.
When, for example, the algorithm of the layer II is employed, as shown in FIG. 10B, audio data is processed in audio frame units (one audio frame unit includes 1152 samples according to the layer II) by using an audio frame pulse shown in FIG. 10A and is thereby converted into coded audio data, i.e., a bit stream.
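The size of one layer II audio frame in time and in coded bytes follows directly from the 1152-sample frame length. The sketch below assumes a 48 kHz sampling frequency and a 384 kb/s bit rate purely as example operating values, and it ignores padding; the function names are hypothetical.

```python
from fractions import Fraction

SAMPLES_PER_FRAME_LAYER2 = 1152  # from the text


def frame_duration(sample_rate_hz):
    """Duration of one audio frame in seconds, as an exact fraction."""
    return Fraction(SAMPLES_PER_FRAME_LAYER2, sample_rate_hz)


def frame_bytes(bit_rate_bps, sample_rate_hz):
    """Coded bytes per frame: bits per second times seconds per frame,
    divided by 8 (padding is ignored in this sketch)."""
    return frame_duration(sample_rate_hz) * bit_rate_bps / 8


# Assumed example values, not taken from the text.
duration_48k = frame_duration(48000)     # 3/125 s, i.e. 24 ms
size_384k = frame_bytes(384000, 48000)   # 1152 bytes
```

Under these assumed values one audio frame lasts 24 ms and occupies 1152 bytes of the bit stream.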
With the recent development of digital signal processing techniques, a DVD video disk capable of storing one whole movie on an optical disk (whose diameter is 120 mm) has been developed. This DVD video disk is manufactured by multiplexing coded video data, coded audio data and coded attached information such as a superimposed dialogue or the like, and recording the multiplexed data on one optical disk.
FIG. 11 shows, by way of example, an arrangement of a DVD video disk editing system (authoring system) employing an encoder for simultaneously encoding video data and audio data. This authoring system has a digital VTR 101, an encoder 102, and a computer 103.
The digital VTR 101 has a video reproducing unit 101A for reproducing digital video data from a digitally recorded video tape, and an audio reproducing unit 101B for reproducing digital audio data from a digitally recorded audio tape TP.
The encoder 102 encodes video data Dv and audio data Da from the above digital VTR 101 under the control of the computer 103, and further multiplexes the encoded video data and the encoded audio data to output them as a series of bit streams.
The encoding of the above video data Dv and the above audio data Da is carried out in accordance with algorithms described in a video encoding program and an audio encoding program, respectively, stored in a main memory device of the computer 103. The multiplexing of the encoded video data and the encoded audio data is carried out in accordance with an algorithm described in a multiplexing program stored in the main memory device of the computer 103.
The above authoring system encodes reproduced data from the digital VTR 101 and so on, i.e., the video data Dv and the audio data Da in the above example, and converts them into data of respective predetermined data rates. Then, the authoring system rearranges both the video data and the audio data in accordance with predetermined formats and records them on a recording medium such as an optical disk or the like.
When audio data is encoded in a system for carrying out the authoring, if only one digital audio tape TP is loaded onto the audio reproducing apparatus, then processing the audio tape continuously from its head keeps the encoded audio frames successive throughout.
If a plurality of digital audio tapes TP are loaded in turn into the digital VTR 101, then when the digital audio tape TP loaded into the audio reproducing unit 101B of the digital VTR 101 is exchanged for another one, it is impossible to obtain the phase difference (offset) between an audio frame of the audio data Da reproduced from the previously loaded digital audio tape TP and the latest reference timing (video frame) thereof. Therefore, an audio frame of the audio data reproduced from the newly loaded digital audio tape TP and the audio frame of the previously loaded digital audio tape TP cannot be made continuous.
Specifically, when the audio data Da is encoded, a plurality of audio samples are processed together as one processing unit (an audio frame). Since the number of audio samples to be encoded in this way is set to a value which is convenient for the coding calculation, the frame period of the audio frame inevitably has no relation to the period of the processing unit of the video data Dv (a video frame).
Therefore, if the audio data is encoded while the audio and video frames are not synchronized with each other, then audio samples having the same time code value may be inserted into different audio frames depending upon the coding processing (i.e., reproducibility may be lost).
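The lack of any fixed relation between the two frame periods can be seen in a small calculation. The sketch below assumes, for illustration only, a 1152-sample audio frame at 48 kHz and a video frame rate of 30000/1001 frames per second; neither rate is stated in the text, and the function name is hypothetical. It computes the time elapsed from the most recent audio frame start to each video frame pulse when both start together at time zero.

```python
from fractions import Fraction

# Assumed example periods in seconds (not taken from the text).
AUDIO_FRAME = Fraction(1152, 48000)   # 24 ms
VIDEO_FRAME = Fraction(1001, 30000)   # about 33.37 ms


def offset_at_video_pulse(k):
    """Time from the most recent audio frame start to the k-th video
    frame pulse, with both frame grids starting at t = 0."""
    return (k * VIDEO_FRAME) % AUDIO_FRAME


offsets = [offset_at_video_pulse(k) for k in range(5)]
```

Under these assumptions the offset is different at every one of the first five video frame pulses, so which audio frame a given sample falls into depends on where encoding happened to start.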
This lack of reproducibility will be described specifically with reference to FIG. 12. It is assumed that, when a first tape is reproduced as shown in FIGS. 12B and 12C, there is a phase difference (offset) of time τ1 between a frame pulse at t1 of a video frame shown in FIG. 12A and an (N-1)th audio frame, there is an offset of time τ2 between a frame pulse at t2 of the video frame and an (N)th audio frame, and there is an offset of time τ3 between the above time τ2 and an (N+1)th audio frame. It is further assumed that, after the (N)th audio frame is completely encoded, a second tape is inserted into the digital VTR 101 instead of the first tape and then reproduced.
In this case, the frame period obtained when the second tape is reproduced generally differs from the frame period obtained when the first tape is reproduced. Usually, as shown in FIGS. 12D and 12E, audio data starting from a frame prior to the target (N+1)th frame, e.g., from an (N-2)th frame, is recorded on the second tape as a so-called "margin".
Therefore, when the audio data is transmitted to an encoding processing system of an encoder, the "margin" portion is skipped and the audio data from the target (N+1)th frame onward is transmitted. At this time, as described above, the timing of the frame period with respect to the first tape differs from the timing of the frame period with respect to the second tape. Consequently, if at the editing timing of an edit point EP shown in FIG. 12F the audio data of the (N+1)th frame is derived in accordance with the frame period of the second tape as it is, then an unnecessary audio sample of encoded data shown in FIG. 12G is added between the audio data of the (N)th frame from the first tape and the audio data of the (N+1)th frame from the second tape, which prevents the (N)th audio frame and the (N+1)th audio frame from being made continuous.
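The continuity condition at the edit point can be expressed as simple sample arithmetic. The sketch below is a simplified model of the problem, not the method of the invention: it assumes that positions are counted as absolute sample indices on the first tape's timeline, that the second tape's own frame grid begins at its first recorded sample (the start of the margin), and that all names are hypothetical.

```python
FRAME = 1152  # samples per layer II audio frame (from the text)


def samples_to_skip(frames_encoded, second_start):
    """Margin samples to discard so that the next frame begins exactly
    where the last frame from the first tape ended.

    frames_encoded: number of frames already encoded from the first tape.
    second_start:   sample index of the second tape's first sample.
    """
    boundary = frames_encoded * FRAME
    skip = boundary - second_start
    if skip < 0:
        raise ValueError("second tape starts after the frame boundary")
    return skip


def stray_samples_if_unaligned(frames_encoded, second_start):
    """Samples left between the boundary and the next frame start when
    the second tape's own frame grid is used as it is."""
    skip = samples_to_skip(frames_encoded, second_start)
    return (-skip) % FRAME


# Hypothetical case: 10 frames encoded; margin of 3 frames plus 200 samples.
start = 10 * FRAME - (3 * FRAME + 200)
skip = samples_to_skip(10, start)
stray = stray_samples_if_unaligned(10, start)
```

In this hypothetical case, skipping 3656 samples keeps the frames continuous, whereas keeping the second tape's own frame grid leaves 952 stray samples between the (N)th and (N+1)th frames.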
If the audio data is encoded in a state in which the audio frames are discontinuous, then a decoder of a DVD player for reproducing a DVD video disk cannot decode the audio data correctly, which is a problem for an authoring system.
In order to make the audio frames of different tapes continuous, two methods can be considered: a first method of storing a plurality of audio data in advance in some suitable means and thereafter editing them so that all the audio frames are continuous; and a second method of dividing the audio data at a silent portion, whose data may be encoded twice or removed if the audio frames of different tapes are not aligned.
The first method has the disadvantage of requiring a memory having a large capacity. The second method is not practical because there is no guarantee that a silent portion occurs regularly on a tape.
In view of the above aspects, it is an object of the present invention to provide a data processing apparatus and a data processing method which, when information data supplied together with a reference timing while a plurality of sources are switched is divided into data of predetermined processing units and then encoded, can keep the processing units used for encoding continuous at a source switch point and allow the source to be divided at an arbitrary point regardless of whether the point is a silent portion.
It is another object of the present invention to provide a data recording apparatus and a data recording method which, when digital data formed by inserting other information data into reference information data input together with a reference timing is divided into data of predetermined processing units and then encoded and the encoded data are recorded on a recording medium, can keep the processing units used for the encoding processing continuous at the insertion point of the other information data, and can make the encoded data of the reference information data and the encoded data of the other information data continuous at the above insertion point when recording them on the recording medium.