Various proposals have been made in recent years for the standardization of high efficiency encoding for compressing video data. High efficiency encoding techniques encode video data at a lower bit rate in order to improve the efficiency of digital transmission and recording. For instance, the CCITT (Comite Consultatif International Telegraphique et Telephonique, or International Telegraph and Telephone Consultative Committee) has issued Recommendation H.261 for video-conference/video-telephone standardization. According to the CCITT recommendation, encoding is performed using the intra-frame compression processed frame I and the inter-frame (predictive) compression processed frame P.
Referring now to FIG. 1, the video data compression standard according to the CCITT Recommendation will be explained.
The intra-frame compression processed frame I is one frame of video data encoded by DCT (Discrete Cosine Transform) processing. The inter-frame compression processed frame P is video data encoded by predictive encoding using the intra-frame compression processed frame I or another inter-frame compression processed frame P. In addition, the bit rate is further reduced by converting these encoded data into variable length codes. Because the intra-frame compression processed frame I is encoded using intra-frame information only, it can be decoded from its own encoded data alone. The inter-frame compression processed frame P, on the other hand, is encoded using correlations with other video data and therefore cannot be decoded from its own encoded data alone.
FIG. 2 is a block diagram showing the recording section of a conventional recording/playback apparatus using such a predictive encoding.
The luminance signal Y and the color difference signals Cr and Cb are applied to a multiplexer 11, where they are multiplexed in blocks of 8 columns by 8 rows (8×8 blocks). The sampling rate of the color difference signals Cr and Cb in the horizontal direction is 1/2 that of the luminance signal Y. Therefore, in the period in which two 8×8 luminance (Y) blocks are sampled, one 8×8 block of each of the color difference signals Cr and Cb is sampled. As shown in FIG. 3, these four blocks in total form a macro block (MB). Here, the two luminance signal blocks Y and the color difference signal blocks Cr and Cb represent the same position on the picture frame. Further, a plurality of macro blocks define a GOB (group of blocks) and a plurality of GOBs define one frame. The output of the multiplexer 11 is applied to a DCT unit 13 through a subtracter 12.
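By way of illustration only, the formation of one macro block from two luminance blocks and one block each of Cr and Cb may be sketched as follows. The sketch assumes that each signal is held as a list of pixel rows and that the Cr and Cb planes already have half the horizontal resolution of the Y plane; the function names are hypothetical and do not appear in the apparatus of FIG. 2.

```python
BLOCK = 8  # block size in pixels (8x8), as in the text


def extract_block(plane, top, left):
    """Return the 8x8 block whose upper-left corner is (top, left)."""
    return [row[left:left + BLOCK] for row in plane[top:top + BLOCK]]


def make_macro_block(y, cr, cb, mb_row, mb_col):
    """Collect two Y blocks and one Cr and one Cb block for one macro block.

    Assumption: cr and cb have half the horizontal resolution of y, so a
    16-pixel-wide luminance area maps onto one 8-pixel-wide chroma block.
    """
    top = mb_row * BLOCK
    y_left = mb_col * 2 * BLOCK   # two luminance blocks side by side
    c_left = mb_col * BLOCK       # one chroma block covers the same area
    return {
        "Y": [extract_block(y, top, y_left),
              extract_block(y, top, y_left + BLOCK)],
        "Cr": extract_block(cr, top, c_left),
        "Cb": extract_block(cb, top, c_left),
    }
```

All four blocks returned for a given macro block index cover the same area of the picture frame, which is the property stated above.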
When performing the intra-frame compression, a switch 14 is kept OFF and the output of the multiplexer 11 is applied directly to the DCT unit 13, as described later. A signal block of 8×8 pixels is applied to the DCT unit 13. The DCT unit 13 converts the input signal into frequency components through 8×8 two-dimensional DCT processing. This makes it possible to reduce the spatially correlated components. The output of the DCT unit 13 is applied to a quantizer 15, which lowers the redundancy of each block signal by requantizing the DCT output using a fixed quantization coefficient. Further, block pulses are supplied to the multiplexer 11, the DCT unit 13, the quantizer 15, etc., which operate in units of blocks.
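The DCT and requantization steps may be illustrated by the following minimal sketch. It assumes an orthonormal DCT-II and a single uniform quantization step applied to every coefficient; the text does not specify the normalization or the quantizer design, so both are assumptions, as are the function names.

```python
import math

N = 8  # block size


def dct_2d(block):
    """Orthonormal 8x8 two-dimensional DCT-II of a list-of-rows block."""
    def c(k):
        # Normalization factor (assumed orthonormal scaling).
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def quantize(coeffs, step):
    """Requantize every coefficient with one fixed step (an assumption)."""
    return [[int(round(v / step)) for v in row] for row in coeffs]
```

For a block of uniform brightness, all of the energy falls in the single lowest-frequency coefficient and the remaining 63 quantized values are zero, which illustrates how the transform reduces spatially correlated components.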
The quantized data from the quantizer 15 is applied to a variable length encoder 16 and is encoded, for instance, into Huffman codes based on the statistics of the code amount of the quantized output. As a result, a short bit sequence is assigned to data having a high probability of appearance and a long bit sequence to data having a low probability of appearance, so that the transmission amount is further reduced. The output of the variable length encoder 16 is applied to an error correcting encoder 17, which adds an error correcting parity to the output of the variable length encoder 16 and provides the result to a multiplexer 19.
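The assignment of short codes to frequent values and long codes to rare values may be sketched as follows. This is a generic Huffman construction built from observed symbol counts, given only as an illustration; the actual code tables used by the variable length encoder 16 are not described in the text, and the function name is hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code from observed symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps the dictionaries from being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing one bit to each side.
        merged = {s: "0" + c for s, c in a.items()}
        merged.update({s: "1" + c for s, c in b.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]
```

Applied to quantized DCT output, in which zero typically dominates, such a construction gives the zero value the shortest code.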
The output of the variable length encoder 16 is also applied to an encoding controller 18. The amount of output data varies widely depending on the input picture signal. The encoding controller 18 therefore monitors the amount of output data from the variable length encoder 16 and adjusts it by controlling the quantization coefficient of the quantizer 15. Further, the encoding controller 18 may restrict the amount of output data by controlling the variable length encoder 16.
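The feedback performed by the encoding controller 18 may be pictured with the following sketch. The control rule (a fixed increment or decrement of the quantization step), the step limits and the function name are all assumptions made for illustration; the text states only that the quantization coefficient is controlled according to the monitored amount of output data.

```python
def update_quant_step(step, produced_bits, target_bits,
                      min_step=1, max_step=62):
    """Return the quantization step to use for the next block.

    Hypothetical rule: coarser quantization when the encoder produced
    more bits than the target, finer quantization when it produced fewer.
    The limits min_step and max_step are illustrative values only.
    """
    if produced_bits > target_bits:
        step += 2
    elif produced_bits < target_bits:
        step -= 2
    return max(min_step, min(max_step, step))
```

A coarser step yields more zero-valued coefficients and hence fewer variable length code bits, at the cost of picture quality.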
A sync/ID generator 20 generates a frame sync signal and an ID signal representing data contents and additional information and provides them to the multiplexer 19. The multiplexer 19 forms one sync block of data from a sync signal, an ID signal, compressed signal data and a parity, and provides this data to a recording encoder (not shown). The recording encoder encodes the output from the multiplexer 19 according to the characteristics of a recording medium and records the encoded data on the recording medium (not shown).
If the switch 14 is ON, the difference between the current frame signal from the multiplexer 11 and the motion compensated preceding frame data, which will be described later, is obtained in the subtracter 12 and applied to the DCT unit 13. In this case, inter-frame encoding is carried out, in which differential data is encoded by exploiting the redundancy between frames. If the difference between the preceding frame and the current frame is simply taken, it becomes large whenever there is motion in the picture. The difference is therefore made small by motion compensation: a motion vector is detected by finding the position in the preceding frame that corresponds to a prescribed position in the current frame, and the difference is taken at the pixel position indicated by the motion vector.
That is, the output of the quantizer 15 is also applied to an inverse quantizer 21. This quantized output is inverse quantized in the inverse quantizer 21, inverse DCT processed in an inverse DCT unit 22 and thereby restored to the original video signal. The original information cannot be reconstructed completely through the DCT processing, requantization, inverse quantization and inverse DCT processing, and part of the information is lost. In this case, as the output of the subtracter 12 is differential information, the output of the inverse DCT unit 22 is also differential information. The output of the inverse DCT unit 22 is applied to an adder 23. The output from the adder 23 is fed back to the adder 23 through a variable delay unit 24, which delays signals by about one frame period, and a motion compensator 25. The adder 23 reproduces the current frame data by adding the differential data to the preceding frame data and provides the result to the variable delay unit 24.
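The role of this local decoding loop may be illustrated with a much simplified sketch in which each "frame" is a single number and the DCT and motion compensation are omitted (both simplifications are assumptions made for brevity, and the function names are hypothetical). Because the encoder predicts from the reconstructed preceding value rather than from the original, it uses the same reference that a decoder will have, so the quantization loss does not accumulate from frame to frame.

```python
def encode_sequence(samples, step):
    """Closed-loop differential encoding of scalar 'frames'."""
    levels = []
    recon_prev = 0  # contents of the delay unit (reconstructed data)
    for s in samples:
        # Quantize the difference from the reconstructed preceding value.
        level = round((s - recon_prev) / step)
        levels.append(level)
        # Local decoding: inverse quantize and add back the prediction.
        recon_prev = recon_prev + level * step
    return levels


def decode_sequence(levels, step):
    """Decoder side: accumulate the inverse-quantized differences."""
    out = []
    prev = 0
    for level in levels:
        prev = prev + level * step
        out.append(prev)
    return out
```

In this sketch the error of every decoded value stays within half a quantization step, however long the sequence is.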
The preceding frame data from the variable delay unit 24 and the current frame data from the multiplexer 11 are applied to a motion detector 26, where a motion vector is detected. The motion detector 26 obtains the motion vector through full search motion detection using, for instance, a matching calculation. In full search motion detection, the current frame is divided into a prescribed number of blocks and a search range of, for instance, a 15×8 pixel block is set for each block. Within the corresponding search range of the preceding frame, the matching calculation is carried out for each block and an inter-pattern approximation is calculated. Then, the preceding frame block that gives the minimum distortion within the search range is found, and the vector between that preceding frame block and the current frame block is detected as the motion vector. The motion detector 26 provides the motion vector thus obtained to the motion compensator 25.
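Full search motion detection of this kind may be sketched as follows. The sketch assumes that the matching calculation is a sum of absolute differences (SAD), that the search range is a square of plus or minus `search` pixels, and that the motion vector points from the current block position to the matching position in the preceding frame; none of these details is fixed by the text, and the function names are hypothetical.

```python
def sad(cur, ref, cy, cx, ry, rx, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
    return total


def full_search(cur, ref, top, left, size=8, search=4):
    """Return ((dy, dx), distortion) with minimum SAD in the search range.

    cur is the current frame, ref the preceding frame (lists of rows).
    Candidate positions falling outside the preceding frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = top + dy, left + dx
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue
            cost = sad(cur, ref, top, left, ry, rx, size)
            if best is None or cost < best[1]:
                best = ((dy, dx), cost)
    return best
```

The minimum distortion returned alongside the vector is the quantity that can be compared against a threshold when deciding whether inter-frame compression is worthwhile for the block.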
The motion compensator 25 extracts the corresponding block data from the variable delay unit 24, compensates it according to the motion vector and provides it to the subtracter 12 through the switch 14 and also, after timing adjustment, to the adder 23. Thus, the motion compensated preceding frame data is supplied from the motion compensator 25 to the subtracter 12 through the switch 14. When the switch 14 is ON, the inter-frame compression mode results, while when the switch 14 is OFF, the intra-frame compression mode results.
The switch 14 is turned ON/OFF based on a motion signal. That is, the motion detector 26 generates the motion signal depending on whether the size of the motion vector exceeds a prescribed threshold value and applies it to a logic unit 27. The logic unit 27 controls the ON/OFF state of the switch 14 by a logical judgement using the motion signal and a refresh periodic signal. The refresh periodic signal is a signal indicating the intra-frame compression processed frame I, as shown in FIG. 1. If the refresh periodic signal indicates the input of the intra-frame compression processed frame I, the logic unit 27 turns OFF the switch 14 irrespective of the motion signal. Further, if the motion signal indicates that the motion is relatively fast and the minimum distortion obtained by the matching calculation exceeds the threshold value, the logic unit 27 turns OFF the switch 14. Thus, intra-frame compression encoding is carried out for such a block even when data of the inter-frame compression processed frame P are input. TABLE 1, shown below, represents the ON/OFF control of the switch 14 by the logic unit 27.
TABLE 1
______________________________________________________________________________
Frame                                               Motion vector   Switch 14
______________________________________________________________________________
Frame I (intra-frame compression processed frame)   Any             OFF
Frame P (inter-frame compression processed frame)   Detected        ON
Frame P (inter-frame compression processed frame)   Unknown         OFF
______________________________________________________________________________
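The judgement made by the logic unit 27 according to TABLE 1 may be written as the following small function. The function and parameter names are hypothetical; the two inputs stand for the refresh periodic signal and the motion signal, respectively.

```python
def switch_14_on(is_refresh_frame, motion_vector_detected):
    """Return True for switch 14 ON (inter-frame mode), False for OFF.

    is_refresh_frame: the refresh periodic signal indicates a frame I.
    motion_vector_detected: a usable motion vector was found for the block.
    """
    if is_refresh_frame:
        # Frame I: intra-frame compression irrespective of the motion signal.
        return False
    # Frame P: inter-frame compression only when a vector was detected.
    return motion_vector_detected
```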
FIG. 4 is an explanatory diagram showing the data stream of record signals which are output from the multiplexer 19.
As shown in FIG. 4, the first and the sixth frames of the input video signal are converted into intra-frame compression processed frames I1 and I6, respectively, while the second through the fifth frames are converted into inter-frame compression processed frames P2 through P5. The ratio of the data amount of the intra-frame compression processed frame I to that of the inter-frame compression processed frame P is (3 to 10):1. The amount of data of the intra-frame compression processed frame I is relatively large, while the amount of data of the inter-frame compression processed frame P is greatly reduced. Further, the data of the inter-frame compression processed frame P cannot be decoded unless other frame data are decoded.
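The consequence of this ratio for the recorded data stream may be estimated with a short calculation. The sketch assumes one intra-frame compression processed frame followed by four inter-frame compression processed frames per cycle, as in the frame sequence described for FIG. 4, and assumes that all P frames are of equal size; the function name is hypothetical.

```python
def intra_share(ratio, p_frames_per_cycle=4):
    """Fraction of one cycle's data occupied by the single I frame.

    ratio: data amount of one I frame relative to one P frame.
    """
    return ratio / (ratio + p_frames_per_cycle)
```

With a ratio of 3, the I frame accounts for 3/7 of the data in a cycle, and with a ratio of 10 for 10/14. Frames therefore occupy very unequal lengths of the recorded stream.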
FIG. 5 is a block diagram illustrating the decoding section (playback section) of a recording/playback apparatus.
Compressed encoded data recorded on a recording medium is played back through a playback head (not shown) and then applied to an error correction decoder 31. The error correction decoder 31 corrects errors occurring in data transmission and data recording. The played-back data from the error correction decoder 31 is applied to a variable length decoder 33 through a code buffer memory 32 and decoded into prescribed-length data. Further, the code buffer memory 32 may be omitted.
The output of the variable length decoder 33 is inverse quantized in an inverse quantizer 34, and then decoded and restored to the original video signal by an inverse DCT operation in an inverse DCT unit 35. The restored signal is applied to the terminal a of a switch 36. The output of the variable length decoder 33 is also applied to a header signal extractor 37. The header signal extractor 37 retrieves a header for determining whether the input data is intra-frame compression data (intra-frame data) or inter-frame compression data (inter-frame data) and provides the header to the switch 36. On receiving the header representing intra-frame compression data, the switch 36 selects the terminal a and outputs the decoded data from the inverse DCT unit 35.
The inter-frame compression data is decoded by adding together, in an adder 38, the output from the inverse DCT unit 35 and the preceding frame output from a predictive decoder 39. That is, the output of the variable length decoder 33 is applied to a motion vector extractor 40 to obtain the motion vector. The motion vector is then applied to the predictive decoder 39. The decoded output from the switch 36 is delayed for one frame period by a frame memory 41. The predictive decoder 39 compensates the preceding decoded data from the frame memory 41 according to the motion vector and provides it to the adder 38. The adder 38 adds the output from the predictive decoder 39 and the output from the inverse DCT unit 35 together and applies the decoded inter-frame compression data to the terminal b of the switch 36. When inter-frame compression data is applied, the switch 36 selects the terminal b according to the header and outputs the decoded data from the adder 38. Thus, the compression and the expansion of data are carried out without delay in both the intra-frame compression mode and the inter-frame compression mode.
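The selection performed by the switch 36 may be sketched as follows for a single block. The sketch assumes that the inverse DCT output and the motion compensated prediction are available as lists of pixel rows of equal size; the function and parameter names are hypothetical.

```python
def decode_block(is_intra, idct_block, predicted_block=None):
    """Return the decoded block selected by the header.

    is_intra: header indicates intra-frame compression data (terminal a).
    idct_block: output of the inverse DCT (pixels or differences).
    predicted_block: motion compensated preceding frame data, needed
        only for inter-frame compression data (terminal b).
    """
    if is_intra:
        # Terminal a: the inverse DCT output is already the picture data.
        return [row[:] for row in idct_block]
    # Terminal b: add the differential data to the prediction.
    return [[d + p for d, p in zip(drow, prow)]
            for drow, prow in zip(idct_block, predicted_block)]
```

The block returned in either case is what would be written to the frame memory for use in predicting the next frame.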
Various systems for recording such high efficiency encoded digital video data on a magnetic video cassette recorder (VCR) have been developed. FIG. 6 is an explanatory diagram showing the recording tracks produced on a recording medium by such a VCR.
In FIG. 6, A1, A2, . . . etc., represent the recording tracks recorded by a plus-azimuth head, while B1, B2, . . . etc., represent the recording tracks recorded by a minus-azimuth head. There is normally no particular problem in the normal playback operation. However, when triple-speed playback is performed, the trace patterns of the heads are as illustrated by the arrow in FIG. 6, and only the hatched sections, where the head azimuth agrees with the azimuth of the recording track, are played back. Even in this case, one picture can be played back in the case of analog recording, where an on-screen position and the recorded position on the recording medium correspond to each other. However, the intra-frame compression processed frame I and the inter-frame compression processed frame P differ from each other in their encoded amounts. If the data stream shown in FIG. 4 is recorded on a recording medium, one frame cannot necessarily be reconstructed from the data played back in triple-speed playback. Further, the inter-frame compression processed frame P cannot be played back when any undecoded frame occurs, as in triple-speed playback, because the inter-frame compression processed frame P cannot be decoded as an independent frame. Furthermore, as data are played back discontinuously in this case, the data following an interruption cannot be used efficiently in a system that decodes an input data train continuously, such as a video telephone.
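The dependency problem described above may be illustrated by the following sketch, which determines which frames remain decodable when only some frames are recovered completely during trick play. It assumes, for illustration, that each inter-frame compression processed frame P is predicted from the immediately preceding frame; the function name and the data representation are hypothetical.

```python
def decodable_frames(frame_types, recovered):
    """Return the indices of the frames that can actually be decoded.

    frame_types: list of 'I' or 'P', one entry per frame in stream order.
    recovered: set of frame indices whose data was played back completely.
    """
    ok = []
    prev_ok = False  # whether the preceding frame was decoded
    for i, t in enumerate(frame_types):
        if i in recovered and (t == "I" or prev_ok):
            ok.append(i)
            prev_ok = True
        else:
            # A missing or undecodable frame breaks the prediction chain
            # until the next I frame is recovered.
            prev_ok = False
    return ok
```

In this sketch, the loss of a single P frame makes every following P frame unusable until the next I frame, even if their data was read correctly.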
In order to make data correspond to positions on the picture screen, it is known to add block address information. However, data other than video data is thereby added, which lowers the data utilization coefficient. Further, it is also known to record data on a recording medium with address information added in correspondence with the on-screen position at the receiving section, or to reconstruct intra-frame compression data by changing the format to fit the memory. However, the data required for executing the format transformation is not transmitted. Furthermore, effective decoding using discontinuously played-back data is not performed.
Further, in the Japanese Patent Application (TOKU-GAN-HEI) PO3-330650, the applicant of the present application has proposed "Variable Length Code Recording/Reproducing Apparatus" for format transformation and recording in a VTR without decoding received broadcasting signals. Furthermore, the applicant of the present application has also proposed "Transmission System" for adjusting data length and adding a skip code in consideration of a trick play operation of VCRs in the Japanese Patent Application (TOKU-GAN-HEI) PO4-067610.
Here an MPEG (Moving Picture Experts Group) Standard 1 has been proposed as a compression technique of a moving picture in a field of storage media. This MPEG Standard is adapted for semi-moving pictures at a transmission rate of 1.2 Mbps. The MPEG Standard is adapted for a CD-ROM, etc. FIG. 7 is an explanatory diagram showing the data structure of the MPEG Standard.
As illustrated in FIG. 7, the data structure of the MPEG system is hierarchical, and a start code is added to all the layers with the exception of the macro block layer. The lowest layer, the block layer, is constructed of 8×8 pixel blocks. The size of one pixel (and hence of one block) differs between the luminance component and the color difference component, as their sampling periods are different from each other. If the sampling ratio of the luminance component to the color difference component is 4:1, four luminance blocks correspond to one color difference block. For this reason, the macro block layer (corresponding to the small block in the "High Efficient Encoding/Decoding System") is constructed by adding a header to four luminance blocks Y0 through Y3, arranged two each in the column and row directions of the luminance component, and two blocks of the color difference signals Cr and Cb.
A slice layer (corresponding to the macro block in the "High Efficient Encoding/Decoding System") composed of one or more macro blocks is formed by predictive encoding in units of this macro block, and one frame picture layer is formed by N slice layers. Bidirectional prediction, backward prediction, forward prediction or intra-picture prediction may be adopted for the predictive encoding of a macro block. A GOP layer is constructed of several frames of picture layers. The GOP layer is constructed of bidirectional predictive frames (B pictures), forward predictive frames (P pictures) and intra-frame predictive frames (I pictures). For instance, if a prescribed frame is an I picture, all macro blocks in the slice layer are encoded using the intra-frame prediction. Further, in the case of a P picture frame, the macro blocks in the slice layer are encoded using the forward prediction or the intra-frame prediction. In the case of a B picture frame, the macro blocks are encoded using any one of the intra-frame prediction, the forward prediction and the backward prediction, or both of the forward and the backward predictions. A video sequence layer is formed by a plurality of GOPs. Further, each of the headers of the video sequence layer, the GOP layer, the picture layer and the slice layer has a start code indicating the start of each layer, while the header of the macro block layer has a macro block address.
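The prediction modes available to a macro block for each picture type, as enumerated above, may be summarized in the following sketch. The mode names and the function name are labels chosen for illustration and are not terms of the MPEG Standard itself.

```python
# Prediction modes permitted per picture type, as listed in the text.
ALLOWED_MODES = {
    "I": {"intra"},
    "P": {"intra", "forward"},
    "B": {"intra", "forward", "backward", "bidirectional"},
}


def mode_allowed(picture_type, mode):
    """Return True if a macro block of this picture type may use the mode."""
    return mode in ALLOWED_MODES[picture_type]
```

Only the I picture is restricted to modes that need no other frame, which is why it alone can serve as a starting point for decoding.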
In the "High Efficient Encoding/Decoding System", a system that includes a macro block data length in the data string has been considered. However, in the MPEG system, no slice layer data length equivalent to this macro block data length exists. Further, if a quantization coefficient of the MPEG system is used directly, the conditional branch in the "High Efficient Encoding/Decoding System" can be achieved, but no rule governing this is available at present. For these reasons, when application to the MPEG system is considered, there is a problem in that it is difficult to handle the two systems equally in their details, although they are the same in principle.
Thus, in the conventional high efficiency encoding/decoding system described above, there was a problem in that the data utilization coefficient is lowered if address information is added to data at a data transmitting section. Further, there was another problem in that the data required for the format transformation was not transmitted to a data receiving section and, in addition, there was a problem in that the decoding operation cannot efficiently use the transmitted data when the data are transmitted in a discontinuous fashion.
Further, there was another problem in that it was difficult to apply the MPEG system to the existing high efficiency encoding/decoding system.