As exemplified by a digital VTR (Video Tape Recorder), data recording and reproducing apparatuses that record a digital video signal and a digital audio signal to a record medium and reproduce them therefrom are known. Since the data capacity of a digital video signal is huge, it is conventionally compression-encoded according to a predetermined method, and the encoded data is then recorded to the record medium. In recent years, MPEG2 (Moving Picture Experts Group phase 2) has become known as a compression-encoding standard.
In picture compression technologies such as the above-mentioned MPEG2, the data compression ratio is improved using variable length code. Thus, depending on the complexity of a picture that is compressed, the amount of compressed code per screen (for example, per frame or per field) fluctuates.
On the other hand, in a recording apparatus that records a video signal to a record medium such as a magnetic tape or a disc record medium, particularly, in a VTR, a predetermined unit such as one frame or one field is used as a unit of a fixed length. In other words, the amount of code per frame or field is limited to a predetermined value or less and recorded to a fixed capacity area of a storage medium.
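The constraint described above can be illustrated with a minimal Python sketch. The function names, the capacity value, and the zero padding of the unused remainder are assumptions made only for illustration; the text states only that the amount of code per frame must not exceed the fixed capacity of the record area.

```python
def fits_fixed_area(frame_code: bytes, area_capacity: int) -> bool:
    """True when the variable-length code for one frame fits the fixed area."""
    return len(frame_code) <= area_capacity


def place_in_fixed_area(frame_code: bytes, area_capacity: int) -> bytes:
    """Return exactly area_capacity bytes holding the frame's code.

    Zero padding of the unused remainder is an assumption of this sketch.
    """
    if not fits_fixed_area(frame_code, area_capacity):
        raise ValueError("code amount for this frame exceeds the fixed area")
    return frame_code + bytes(area_capacity - len(frame_code))
```

A simple picture and a complex picture yield different code amounts, yet both occupy the same fixed-size area as long as the encoder keeps each frame within the limit.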
The fixed length format is used for a VTR because each record area on a magnetic tape as a record medium corresponds to one frame, so that record data for one frame should exactly fit in each record area. In addition, since the record medium is consumed in proportion to record time, the total amount of record data on the record medium and the remaining amount thereof can be accurately obtained. As another advantage, a program start position detecting process can be easily performed in a high speed searching operation. Furthermore, from the viewpoint of controlling the record medium, when data is recorded in the fixed length format on a magnetic tape, the tape, which is dynamically driven, can travel at a constant speed and can therefore be stably controlled. These advantages likewise apply to disc shaped record mediums.
The variable length code encoding format and the fixed length format thus have contrary characteristics. In recent years, a recording apparatus has become known that inputs a video signal as a non-compressed base band signal, compression-encodes the signal with variable length code according to MPEG2 or JPEG (Joint Photographic Experts Group), and records the encoded signal to a record medium. In addition, a recording and reproducing apparatus that directly inputs and outputs a stream that has been compression-encoded with variable length code and records and reproduces the stream has also been proposed. In the following description, it is assumed that the compression encoding format for a digital video signal is MPEG2.
Next, the structure of an MPEG2 data stream will be described in brief. MPEG2 is a combination of motion compensation predictive encoding and compression encoding using DCT (Discrete Cosine Transform). MPEG2 data is hierarchically structured. It is composed of, from the lowest layer to the highest layer, a block layer, a macro block layer, a slice layer, a picture layer, a GOP (Group Of Pictures) layer, and a sequence layer.
The block layer is composed of DCT blocks each of which is a data unit for DCT. The macro block layer is composed of a plurality of DCT blocks. The slice layer is composed of a header portion and at least one macro block. The picture layer is composed of a header portion and at least one slice. One picture corresponds to one screen. The GOP layer is composed of a header portion, an I picture (Intra-coded picture), a P picture (Predictive-coded picture), and a B picture (Bidirectionally predictive-coded picture).
The I picture uses only the information of the picture that is being encoded. Thus, the I picture can be decoded by itself. The P picture uses, as a predictive picture, an I picture or a P picture that has been decoded before the current P picture is decoded. Either the difference between the current P picture and the motion compensated predictive picture is encoded, or the current P picture is encoded without the difference. One of them is selected for each macro block depending on which is more effective. The B picture uses, as predictive pictures, (1) an I picture or a P picture that has been decoded before the current B picture is decoded and that is chronologically earlier than the current B picture, (2) an I picture or a P picture that has been decoded before the current B picture is decoded and that is chronologically later than the current B picture, or (3) an interpolated picture of (1) and (2). Either the difference between the current B picture and one of the three types of motion compensated predictive pictures is encoded, or the current B picture is encoded without the difference. One of them is selected for each macro block depending on which is the most effective.
Thus, there are four types of macro blocks: an intra-frame encoded macro block, a forward inter-frame predictive macro block in which a future macro block is predicted from a past macro block, a backward inter-frame predictive macro block in which a past macro block is predicted from a future macro block, and a bidirectional macro block that is predicted in both the forward and backward directions. All macro blocks in an I picture are intra-frame macro blocks. A P picture contains intra-frame macro blocks and forward inter-frame predictive macro blocks. A B picture contains all four types of macro blocks.
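The relation between picture types and the macro block types they may contain can be written as a small lookup table. The following Python sketch is illustrative only; the enumeration and function names are hypothetical and are not taken from the MPEG2 syntax.

```python
from enum import Enum


class MacroblockType(Enum):
    INTRA = "intra-frame"
    FORWARD = "forward inter-frame predictive"
    BACKWARD = "backward inter-frame predictive"
    BIDIRECTIONAL = "bidirectional"


# Macro block types permitted in each picture type, as listed in the text.
ALLOWED_TYPES = {
    "I": {MacroblockType.INTRA},
    "P": {MacroblockType.INTRA, MacroblockType.FORWARD},
    "B": set(MacroblockType),
}


def is_allowed(picture_type: str, mb_type: MacroblockType) -> bool:
    """True when a macro block of mb_type may appear in picture_type."""
    return mb_type in ALLOWED_TYPES[picture_type]
```

For example, `is_allowed("P", MacroblockType.BACKWARD)` is false, because a P picture is predicted only from earlier pictures.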
A macro block is a set of a plurality of DCT blocks and formed by dividing one screen (picture) into a lattice of 16 pixels×16 lines. A slice is formed by connecting macro blocks for example in the horizontal direction. The number of macro blocks per one screen depends on the size thereof.
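As a rough illustration of how the number of macro blocks depends on the screen size, the following Python sketch counts the 16 pixel × 16 line cells needed to cover a picture. The function name and the example sizes are assumptions of this sketch, as is the rounding up of dimensions that are not multiples of 16 to whole macro blocks.

```python
MACROBLOCK_SIZE = 16  # a macro block covers 16 pixels x 16 lines


def macroblocks_per_picture(width: int, height: int) -> int:
    """Number of 16x16 macro blocks needed to cover a width x height picture."""
    columns = -(-width // MACROBLOCK_SIZE)  # ceiling division
    rows = -(-height // MACROBLOCK_SIZE)
    return columns * rows


print(macroblocks_per_picture(720, 480))  # 45 columns x 30 rows = 1350
```

If a slice is formed from the macro blocks of one horizontal row, the 720 × 480 example gives 30 slices of 45 macro blocks each.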
In the MPEG format, one slice is one variable length code sequence. The variable length code sequence is a sequence of which the boundary of data cannot be detected unless variable length code is correctly decoded. When an MPEG stream is decoded, the header portion of a slice is detected so as to obtain the start point and the end point of variable length code.
In MPEG, one slice is conventionally composed of one stripe (16 lines). The variable length encoding starts at the left edge of the screen and ends at the right edge of the screen. Thus, when a VTR has recorded an MPEG elementary stream and reproduces it at high speed, the VTR mainly reproduces the left edge of the screen, so that the screen cannot be evenly updated. In addition, since the position of data on the tape cannot be predicted, the screen cannot be evenly updated even if the tape pattern is traced at predetermined intervals. Moreover, if even one error takes place, it affects the picture up to the right edge of the screen, because the error continues until the next slice header is detected. Thus, one slice is preferably composed of one macro block, which solves these inconveniences.
On the other hand, a video signal is recorded on a magnetic tape in a helical track format, in which tracks are diagonally formed by a rotating head. On one track, sync blocks, each of which is the minimum record unit, are grouped into sectors for each data type. In addition, record data for one frame is recorded in a predetermined record area. For example, record data for one frame is recorded across eight tracks.
In a digital VTR, an editing process is normally performed. The editing process is preferably performed in as small a data unit as possible. When an MPEG2 stream has been recorded, one GOP may be used as an edit unit. With a closed GOP, which can be decoded without the need to use an earlier GOP or a later GOP, an editing process can be performed for each GOP. However, when a GOP is composed of, for example, 15 frames, the editing unit is too large.
In MPEG, to allow data to be accessed at random, a GOP (Group Of Pictures) structure is defined as a group of a plurality of pictures. The provisions with respect to a GOP in MPEG state, firstly, that the first picture of a GOP in stream order is an I picture and, secondly, that the last picture of a GOP in the order of the original pictures is an I picture or a P picture. In addition, a GOP structure that requires a prediction using the last I picture or P picture of an earlier GOP is permitted. A GOP that can be decoded without the need to use a picture of an earlier GOP is referred to as a closed GOP.
In MPEG, since coding is performed using the correlation between frames within each GOP, there is a restriction when an MPEG bit stream is edited. In other words, when an edit point matches the end of a GOP, no problem takes place as long as the GOP is a closed GOP. However, since the length of one GOP is often as large as 0.5 seconds, the interval between possible edit points is too long. Thus, it is generally preferred to perform an editing operation with an accuracy of one frame (picture).
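The figure of about 0.5 seconds is consistent with the 15-frame GOP mentioned earlier if a frame rate of 30 frames per second is assumed. The frame rate is an assumption of the following Python sketch and is not stated in the text.

```python
def edit_interval_seconds(frames_per_gop: int, frame_rate: float) -> float:
    """Spacing between possible edit points when editing is done per GOP."""
    return frames_per_gop / frame_rate


print(edit_interval_seconds(15, 30.0))  # 15-frame GOP: 0.5 s between edit points
print(edit_interval_seconds(1, 30.0))   # 1-frame GOP: one edit point per frame
```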
However, when an MPEG stream contains a predictive picture that requires an earlier picture or both an earlier picture and a later picture for decoding the predictive picture, it becomes impossible to perform the editing process for each frame. Thus, preferably, all pictures are encoded with intra-frame code and one GOP is composed of one intra-picture. Such a stream satisfies the encoding syntax of MPEG2.
In addition, at the beginning of each of the sequence layer, the GOP layer, the picture layer, the slice layer, and the macro block layer, an identification code composed of a predetermined bit pattern is placed. The identification code is followed by a header portion that contains the encoding parameters of the layer. An MPEG decoder that performs an MPEG2-decoding process extracts each identification code by a pattern-matching operation, determines the hierarchical level, and decodes the MPEG stream according to the parameter information contained in the header portion. The headers of the picture layer and the layers below it carry information that is necessary for each frame. Thus, these headers should be added to each frame. In contrast, the header of the sequence layer needs to be added only to each sequence or each GOP. In other words, it is not necessary to add the header of the sequence layer to each frame.
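The pattern-matching step can be sketched in Python as follows. The byte values are those of the MPEG2 video start codes (a three-byte prefix 0x000001 followed by a one-byte code); they are not given in the text above and are included here as background. The function names are hypothetical, and the sketch classifies only the sequence, GOP, picture, and slice levels.

```python
START_CODE_PREFIX = b"\x00\x00\x01"


def classify_start_code(code: int) -> str:
    """Map the byte that follows the prefix to a layer name."""
    if code == 0x00:
        return "picture"
    if 0x01 <= code <= 0xAF:
        return "slice"
    if code == 0xB3:
        return "sequence"
    if code == 0xB8:
        return "gop"
    return "other"


def find_start_codes(stream: bytes) -> list[tuple[int, str]]:
    """Return (byte offset, layer name) for every start code in stream."""
    found = []
    pos = stream.find(START_CODE_PREFIX)
    while pos != -1 and pos + 3 < len(stream):
        found.append((pos, classify_start_code(stream[pos + 3])))
        # Resume after the code byte so it is not re-read as part of a prefix.
        pos = stream.find(START_CODE_PREFIX, pos + 4)
    return found
```

A decoder built along these lines would hand the bytes following each detected code to the header parser for the corresponding layer. That parsing is omitted here.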
Now, the header of the sequence layer will be described. The information contained in the header of the sequence layer includes the number of pixels, bit rate, profile, level, color difference format, progressive sequence, and so forth. This information is normally the same throughout the sequence when it is assumed that one video tape is one sequence. According to the encoding syntax of MPEG, the header of the sequence layer can be added at the beginning of the video tape. In addition, according to the encoding syntax of MPEG, a quantizing matrix may be present in either the header of the sequence layer or the header of the picture layer. According to the encoding syntax of MPEG, the quantizing matrix can be added or omitted.
When every picture is intra-frame encoded as described above, the editing operation can be performed with an accuracy of one frame. However, the chronological relation between the frames of a non-edited tape and those of an edited tape may be inverted. FIG. 25 shows an outline of such a problem. FIG. 25A shows the chronological relation of frames of a stream that has not been edited. Picture headers are added to the picture data of frames 4 and 5, whereas the headers of the sequence layer and the GOP layer are not added to the picture data of frames 4 and 5. However, using the information of the header of the sequence layer and the header of the GOP layer added to the preceding frame 3, these picture data can be decoded.
When an editing process is performed, the chronological relation of frames may be inverted. For example, as shown in FIG. 25B, the picture data of frames 4 and 5 may be placed at the chronological positions of frames 1 and 2, before frame 3. As is clear from this example, after the editing process is performed, the header of the sequence layer may not be present at the beginning of the video tape. Alternatively, the header of the sequence layer may not be present at any position of the video tape. Thus, a tape that satisfies the encoding syntax of MPEG before it is edited may no longer satisfy it after it is edited. When the tape does not satisfy the encoding syntax of MPEG, the tape cannot be MPEG-decoded. As was described above, since one GOP is composed of one I picture, no problem takes place even if the header of the GOP layer is not obtained. In the case that the header of the sequence layer is recorded only at the beginning of the tape, when the tape is reproduced from any position other than the beginning thereof, the header of the sequence layer may not be obtained. In addition, when the tape is inversely reproduced, the chronological relation of frames is inverted as with the editing process. Moreover, in special reproducing operations such as a high speed reproducing operation and a slow reproducing operation, the header of the sequence layer may not be obtained.
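The situation of FIG. 25 can be modeled with a short Python sketch. The `Frame` class and the helper function are hypothetical constructs for illustration; only the frame numbers and the placement of the sequence-layer header on frame 3 follow the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    number: int
    has_sequence_header: bool


def frames_without_sequence_header(frames: list[Frame]) -> list[int]:
    """Numbers of frames reached before any sequence-layer header is seen."""
    header_seen = False
    missing = []
    for frame in frames:
        header_seen = header_seen or frame.has_sequence_header
        if not header_seen:
            missing.append(frame.number)
    return missing


# Before editing: frame 3 carries the header; frames 4 and 5 rely on it.
before = [Frame(3, True), Frame(4, False), Frame(5, False)]
# After editing: frames 4 and 5 have been moved ahead of frame 3.
after = [Frame(4, False), Frame(5, False), Frame(3, True)]

print(frames_without_sequence_header(before))  # []
print(frames_without_sequence_header(after))   # [4, 5]
```

The same check applied to a reversed list, or to a list that starts partway through the tape, models inverse reproduction and reproduction from an arbitrary position.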
In addition, according to the encoding syntax of MPEG, a quantizing matrix may be placed in the header of the sequence layer or in the header of the picture layer. However, the encoding syntax of MPEG does not state that a quantizing matrix is always placed in each frame. Thus, as with the header of the sequence layer, when MPEG decoding is performed after an editing process has been performed for each frame, the quantizing matrix may not be obtained. Likewise, when the tape is reproduced from any position or when a special reproducing operation is performed, the quantizing matrix may not be obtained.
Therefore, an object of the present invention is to provide a recording apparatus and a recording method that allow an editing process to be performed for each frame and that ensure that a bit stream satisfies the encoding syntax whether it is reproduced normally from any position or reproduced in a special manner such as inverse reproduction.