A data recording and reproducing apparatus that records a digital video signal and a digital audio signal to a record medium and reproduces them therefrom is known, as exemplified by a digital VTR (Video Tape Recorder). Since the data amount of a digital video signal is huge, the signal is conventionally compression-encoded by a predetermined method, and the encoded data is then recorded to the record medium. In recent years, MPEG2 (Moving Picture Experts Group phase 2) has become known as a compression-encoding standard.
In picture compression technologies such as the above-mentioned MPEG2, the data compression ratio is improved using variable length code. Thus, depending on the complexity of a picture that is compressed, the amount of compressed code per screen (for example, per frame or per field) fluctuates.
On the other hand, in a recording apparatus that records a video signal to a record medium such as a magnetic tape or a disc record medium, particularly, in a VTR, a predetermined unit such as one frame or one field is used as a unit of a fixed length. In other words, the amount of code per frame or field is limited to a predetermined value or less and recorded to a fixed capacity area of a storage medium.
The fixed length format is used for a VTR because each record area on a magnetic tape as a record medium holds one frame, so that record data for one frame should exactly fit in each record area. In addition, since the record medium is consumed in proportion to record time, the total amount of record data on the record medium and the remaining amount thereof can be accurately obtained. As another advantage, a program start position detecting process can be easily performed in a high speed searching operation. In addition, from the viewpoint of controlling a record medium, if the record medium is a magnetic tape and data is recorded in the fixed length format, the magnetic tape, which is dynamically driven, can travel at a constant speed and can therefore be stably controlled. These advantages likewise apply to disc-shaped record media.
The variable length code encoding format and the fixed length format thus have contrary characteristics. In recent years, a recording apparatus that inputs a video signal as a non-compressed base band signal, compression-encodes the signal with variable length code corresponding to MPEG2 or JPEG (Joint Photographic Experts Group), and records the encoded signal to a record medium has become known. In addition, a recording and reproducing apparatus that directly inputs and outputs a stream that has been compression-encoded with variable length code and records and reproduces the stream has also been proposed. In the following description, it is assumed that the compression encoding format for a digital video signal is MPEG2.
Next, the structure of an MPEG2 data stream will be described in brief. MPEG2 is a combination of motion compensation predictive encoding and compression encoding using DCT (Discrete Cosine Transform). MPEG2 data is hierarchically structured. The MPEG2 data is composed of a block layer as the lowest layer, a macro block layer, a slice layer, a picture layer, a GOP (Group Of Pictures) layer, and a sequence layer as the highest layer.
The block layer is composed of DCT blocks each of which is a data unit for DCT. The macro block layer is composed of a plurality of DCT blocks. The slice layer is composed of a header portion and at least one macro block. The picture layer is composed of a header portion and at least one slice. One picture corresponds to one screen. The GOP layer is composed of a header portion, an I picture (Intra-coded picture), a P picture (Predictive-coded picture), and a B picture (Bidirectionally predictive-coded picture).
The I picture uses information of only the picture that is being encoded. Thus, the I picture can be decoded by itself. The P picture uses, as a predictive picture, an I picture or a P picture that has been decoded before the current P picture is decoded. Either the difference between the current P picture and the motion compensated predictive picture is encoded, or the current P picture is encoded without the difference. One of them is selected for each macro block depending on which is more effective. The B picture uses, as a predictive picture, (1) an I picture or a P picture that has been decoded and temporally precedes the current B picture, (2) an I picture or a P picture that has been decoded and temporally follows the current B picture, or (3) an interpolated picture of (1) and (2). Either the difference between the current B picture and one of the three types of motion compensated predictive pictures is encoded, or the current B picture is encoded without the difference. One of them is selected for each macro block depending on which is the most effective.
Thus, there are four types of macro blocks: an intra-frame encoded macro block, a forward inter-frame predictive macro block in which a macro block is predicted from a temporally past macro block, a backward inter-frame predictive macro block in which a macro block is predicted from a temporally future macro block, and a bidirectional macro block that is predicted in both the forward and backward directions. All macro blocks in an I picture are intra-frame encoded macro blocks. A P picture contains intra-frame encoded macro blocks and forward inter-frame predictive macro blocks. A B picture contains all four types of macro blocks.
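The relation between picture types and macro block types described above can be written down as a small lookup table. The following is only a minimal sketch; the names MacroBlockType, ALLOWED, and is_allowed are hypothetical and are introduced here purely for illustration.

```python
from enum import Enum


class MacroBlockType(Enum):
    # The four macro block types named in the text.
    INTRA = "intra-frame encoded"
    FORWARD = "forward inter-frame predictive"
    BACKWARD = "backward inter-frame predictive"
    BIDIRECTIONAL = "bidirectional"


# Macro block types that each picture type may contain, as stated in the text.
ALLOWED = {
    "I": {MacroBlockType.INTRA},
    "P": {MacroBlockType.INTRA, MacroBlockType.FORWARD},
    "B": set(MacroBlockType),
}


def is_allowed(picture_type, mb_type):
    """Return True if a picture of the given type may contain mb_type."""
    return mb_type in ALLOWED[picture_type]


print(is_allowed("P", MacroBlockType.FORWARD))   # True
print(is_allowed("I", MacroBlockType.BACKWARD))  # False
```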
A macro block is a set of a plurality of DCT blocks and formed by dividing one screen (picture) into a lattice of 16 pixels×16 lines. A slice is formed by connecting macro blocks for example in the horizontal direction. The number of macro blocks per one screen depends on the size thereof.
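As an illustration of how the number of macro blocks depends on the screen size, the following sketch counts the 16 pixels×16 lines lattice cells that cover a screen. The function name macro_block_grid is hypothetical, the 720×480 screen is only an assumed example, and rounding partial cells up to a whole macro block is likewise an assumption of this sketch.

```python
def macro_block_grid(width, height, mb_size=16):
    """Return (columns, rows, total) of macro blocks covering the screen."""
    cols = -(-width // mb_size)   # ceiling division
    rows = -(-height // mb_size)  # ceiling division
    return cols, rows, cols * rows


# Assumed example: a screen of 720 pixels x 480 lines.
print(macro_block_grid(720, 480))  # (45, 30, 1350)
```

In this assumed example, one horizontal stripe of 16 lines contains 45 macro blocks and one screen contains 1350 macro blocks.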
In the MPEG format, one slice is one variable length code sequence. The variable length code sequence is a sequence of which the boundary of data cannot be detected unless variable length code is correctly decoded. When an MPEG stream is decoded, the header portion of a slice is detected so as to obtain the start point and the end point of variable length code.
In MPEG, conventionally, one slice is composed of one stripe (16 lines). The variable length encoding starts at the left edge of the screen and ends at the right edge of the screen. Thus, when a VTR has recorded an MPEG elementary stream and reproduces it at high speed, the VTR mainly reproduces the left edge of the screen, so the screen cannot be evenly updated. In addition, since the position of data on the tape cannot be predicted, the screen cannot be evenly updated even if a tape pattern is traced at predetermined intervals. Moreover, if even one error takes place, it adversely affects the data up to the right edge of the screen; the error continues until the next slice header is detected. Such inconveniences can be solved by composing one slice of one macro block, which is therefore preferred.
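The effect of the slice size on error propagation can be illustrated with a simple model. The following is a minimal sketch under the assumption that decoding resumes only at the next slice header, so that every macro block from the error position to the end of the containing slice is lost. The function name lost_macro_blocks and the figure of 45 macro blocks per stripe are hypothetical and serve only as an example.

```python
def lost_macro_blocks(error_index, mbs_per_slice):
    """Count macro blocks lost when an error hits macro block error_index.

    Everything from the error position to the end of the containing slice
    is treated as lost, because the variable length code cannot be
    resynchronized before the next slice header.
    """
    offset_in_slice = error_index % mbs_per_slice
    return mbs_per_slice - offset_in_slice


# Assumed example: an error at the 11th macro block (index 10) of a stripe.
print(lost_macro_blocks(10, mbs_per_slice=45))  # 35 (one slice per stripe)
print(lost_macro_blocks(10, mbs_per_slice=1))   # 1  (one slice per macro block)
```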
On the other hand, a video signal is recorded on a magnetic tape in a helical track format in which tracks are diagonally formed with a rotating head. On one track, sync blocks, each of which is the minimum record unit, are grouped for each data type as sectors. In addition, data for one frame is recorded as a plurality of tracks.
In MPEG, to allow data to be accessed at random, a GOP (Group Of Pictures) structure as a group of a plurality of pictures is defined. The provisions with respect to GOP in MPEG state firstly that the first picture of a GOP in the order of the stream is an I picture and secondly that the last picture of a GOP in the order of original pictures is an I picture or a P picture. In addition, a GOP structure that requires a prediction using the last I picture or P picture of an earlier GOP is permitted. A GOP that can be decoded without need to use a picture of an earlier GOP is referred to as a closed GOP.
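The two provisions with respect to GOP can be checked mechanically. The following is a minimal sketch, assuming that a GOP is given as two lists of picture types ("I", "P", "B"), one in the order of the stream and one in the order of original pictures. The function name satisfies_gop_rules is hypothetical, and the sketch does not examine whether the GOP is a closed GOP.

```python
def satisfies_gop_rules(stream_order, display_order):
    """Check the two GOP provisions described in the text.

    stream_order:  picture types in the order they appear in the stream.
    display_order: the same pictures in the order of original pictures.
    """
    if not stream_order or sorted(stream_order) != sorted(display_order):
        return False  # both lists must describe the same non-empty GOP
    first_in_stream_is_i = stream_order[0] == "I"
    last_in_display_is_i_or_p = display_order[-1] in ("I", "P")
    return first_in_stream_is_i and last_in_display_is_i_or_p


# Assumed example: B pictures are transmitted after their reference pictures.
print(satisfies_gop_rules(list("IBBPBB"), list("BBIBBP")))  # True
# A GOP composed of a single intra picture also satisfies both provisions.
print(satisfies_gop_rules(["I"], ["I"]))                    # True
```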
In a digital VTR, an editing process is normally performed. The editing process is preferably performed in as small a data unit as possible. When an MPEG2 stream has been recorded, one GOP may be used as an edit unit. With a closed GOP structure, in which a GOP can be decoded without need to use an earlier GOP or a later GOP, an editing process can be performed for each GOP. However, when a GOP is composed of, for example, 15 frames, the editing unit is too large. Thus, it is preferred to perform an editing process with frame (picture) accuracy.
However, when an MPEG stream contains a predictive picture that requires an earlier picture or both an earlier picture and a later picture for decoding the predictive picture, it becomes impossible to perform the editing process for each frame. Thus, preferably, all pictures are encoded with intra-frame code and one GOP is composed of one intra-picture. Such a stream satisfies the encoding syntax of MPEG2.
In addition, at the beginning of each of the sequence layer, the GOP layer, the picture layer, the slice layer, and the macro block layer, an identification code composed of a predetermined bit pattern is placed. The identification code is followed by a header portion that contains encoding parameters of each layer. An MPEG decoder that performs an MPEG2-decoding process extracts the identification code by a pattern-matching operation, determines the hierarchical level, and decodes the MPEG stream corresponding to the parameter information contained in the header portion. The header of the picture layer and of each lower layer is information necessary for each frame. Thus, these headers should be added to each frame. In contrast, the header of the sequence layer should be added to each sequence or each GOP. In other words, it is not necessary to add the header of the sequence layer to each frame.
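The pattern-matching operation described above can be sketched as follows. This is a minimal illustration, not a decoder: it assumes the MPEG-2 video start code convention, in which a three-byte prefix 0x000001 is followed by one byte that identifies the layer, and it covers only the sequence, GOP, picture, and slice levels. The function names classify and find_start_codes are hypothetical.

```python
START_CODE_PREFIX = b"\x00\x00\x01"


def classify(code):
    """Map the byte that follows the prefix to a hierarchical level."""
    if code == 0x00:
        return "picture"
    if 0x01 <= code <= 0xAF:
        return "slice"
    if code == 0xB3:
        return "sequence"
    if code == 0xB8:
        return "gop"
    return "other"


def find_start_codes(stream):
    """Return a list of (offset, level) for every start code in stream."""
    results = []
    pos = stream.find(START_CODE_PREFIX)
    while pos != -1 and pos + 3 < len(stream):
        results.append((pos, classify(stream[pos + 3])))
        pos = stream.find(START_CODE_PREFIX, pos + 3)
    return results


# Assumed toy stream: the 0xFF bytes stand in for header and picture data.
stream = (b"\x00\x00\x01\xb3" + b"\xff" * 4 +
          b"\x00\x00\x01\xb8" + b"\xff" * 2 +
          b"\x00\x00\x01\x00" + b"\xff" +
          b"\x00\x00\x01\x01\xaa")
print(find_start_codes(stream))
# [(0, 'sequence'), (8, 'gop'), (14, 'picture'), (19, 'slice')]
```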
Information contained in the header of the sequence layer includes the number of pixels, bit rate, profile, level, color difference format, progressive sequence, and so forth. This information is normally the same throughout the sequence when it is assumed that one video tape is one sequence. According to the encoding syntax of MPEG, the header of the sequence layer can be added at the beginning of the video tape. In addition, according to the encoding syntax of MPEG, a quantizing matrix may be present in the header of the sequence layer or the header of the picture layer, and the quantizing matrix can be added or omitted.
With the information contained in the header of the picture layer, the accuracy of the DC (Direct Current) coefficient of an intra macro block is set; the frame structure, field structure, and display field are designated; the quantizing scale is selected; the VLC type is selected; the zigzag/alternate scanning is selected; and the chroma format and so forth are designated. To allow an input picture to be effectively encoded corresponding to the characteristic thereof, the header of the sequence layer and the header of the picture layer can be changed for each frame.
In a digital VTR, an MPEG stream is recorded on a magnetic tape with a rotating head. Diagonal tracks are successively formed on the magnetic tape. In the normal reproducing operation, in which the tape speed is the same as in the recording operation, all recorded data can be reproduced, so no problem takes place even if the header information is changed for each frame. However, in the high speed reproducing operation, in which the tape speed is higher than in the recording operation (for example, twice or higher), data on the tape is fragmentarily reproduced, so a problem takes place if the header information is changed for each frame.
FIG. 26 conceptually shows reproduced data in the high speed reproducing operation. Data of each of frame 1, frame 2, frame 3, . . . and so forth is composed of a header and picture data. There are a sequence header, a GOP header, and a picture header. The picture header is always added to each frame. In the high speed reproducing operation, data shaded in the drawing is fragmentarily reproduced from each frame. A picture of one frame is reproduced from the data thus obtained.
As was described above, to allow an input picture to be effectively encoded corresponding to the characteristic thereof, the header of the sequence layer and the header of the picture layer can be changed for each frame. Thus, if the header of frame 1 is different from the header with which the picture data of another frame was encoded, a picture composed of the header of frame 1 and picture data fragmentarily reproduced from the other frame cannot be correctly decoded.
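The problem can be illustrated with a simplified model of the high speed reproducing operation. The following is only a sketch under stated assumptions: each fragment remembers the frame it was picked up from and the header with which that frame was encoded, and the picture of one frame is decoded with a single applied header. The names Fragment and mismatched_fragments and the header parameter strings are hypothetical and are not taken from the MPEG syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    frame_number: int  # frame from which the fragment was picked up
    header: tuple      # hypothetical stand-in for that frame's header parameters


def mismatched_fragments(applied_header, fragments):
    """Return the frame numbers of fragments encoded with another header."""
    return [f.frame_number for f in fragments if f.header != applied_header]


# Assumed example: frame 2 was encoded with different header parameters.
header_a = ("scan=zigzag", "dc_precision=8")
header_b = ("scan=alternate", "dc_precision=10")
fragments = [Fragment(1, header_a), Fragment(2, header_b), Fragment(3, header_a)]

# The picture is decoded with the header of frame 1.
print(mismatched_fragments(header_a, fragments))  # [2]
```

In this model, the fragment taken from frame 2 cannot be correctly decoded with the header of frame 1, whereas no fragment is mismatched when all frames share the same header.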
Therefore, an object of the present invention is to provide a recording apparatus, a recording method, a reproducing apparatus, and a reproducing method that allow compression-encoded data fragmentarily reproduced in the high speed reproducing operation to be decoded to a picture.