The present invention relates to a method of processing a picture signal and an apparatus for carrying out the method, and more particularly to a method and apparatus suitably applicable to picture recording apparatuses, such as a digital video tape recorder (hereinafter called a digital VTR), that record compressed picture data packeted in units of sync blocks, and to picture signal transmitting apparatuses that transmit compressed picture data packeted in units of sync blocks.
There has so far been proposed a method (U.S. Pat. No. 5,021,879) as a compression coding system of a picture signal in which "motion compensation plus DCT (Discrete Cosine Transform)" processing is given to the picture signal so that an effective reduction of a large quantity of picture information is achieved.
In this compression coding method, the picture in one screen is first divided into sub-blocks (DCT blocks) as the minimum units for the compression and then a plurality of the sub-blocks are combined into a macro-block. Then, the motion vector representing the quantity of displacement between the picture of the previous frame and the picture of the incoming frame is calculated for each macro-block as the unit. The picture compensated for motion according to the motion vector is then subjected to DCT processing, quantization processing, and code word assignment processing in succession and thereby highly efficient encoding of the picture signal is achieved.
The compression coding method is characterized in that the motion vector is obtained for each macro-block as the unit and, therefore, as compared with the case where the motion vector is obtained for each sub-block as the unit, the motion vectors to be transmitted are reduced and the quantity of information to be transmitted or the quantity of information to be recorded as a whole is reduced. In the compression coding method, it is also possible, if required, to form a larger unit for processing by combining a plurality of macro-blocks together.
The "motion compensation plus DCT" processing will be described in concrete terms. Generally, "spatial redundancy" and "temporal redundancy" are present in moving picture signals. In the case of still pictures, for example, the difference between the incoming frame and the previous frame is found to be zero. In most pictures that are not stationary, however, the change between frames consists of constituent elements of the picture moving with time. Therefore, by suitably shifting parts of the previous frame relative to the incoming frame, the difference value can be decreased considerably. Such a process of producing a smaller difference value by obtaining the displacement between the incoming frame and the previous frame, i.e., the motion vector, is called "motion compensation".
At this time, the incoming frame, with respect to the previous frame, is not moving in the same direction all over the screen but directions of the motion are different from part to part, such that some part is moving down and other part is moving up. Therefore, it is practiced to divide the picture of a frame into some blocks and make the motion compensation for each block. The block as the unit for the motion compensation is the macro-block formed of a plurality of DCT blocks. The method giving the DCT processing to the thus obtained differential picture for compressing the same is the "motion compensation plus DCT" process. Thus, by transmitting DCT coefficients of the differential pictures and the motion vectors as motion information from the previous frame, the quantity of information of a picture can be reduced effectively.
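The motion vector calculation described above can be illustrated with a minimal sketch. It assumes an exhaustive block-matching search that minimizes the sum of absolute differences (SAD); the source does not state which search method or matching criterion is used. The function names, the 4 x 4 block size, the search range and the toy frames are all hypothetical and chosen only to keep the example small.

```python
def sad(prev, cur, bx, by, dx, dy, size):
    """Sum of absolute differences between the block of `cur` whose top-left
    corner is (bx, by) and the block of `prev` displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - prev[by + y + dy][bx + x + dx])
    return total


def find_motion_vector(prev, cur, bx, by, size=4, search=2):
    """Exhaustive search over displacements in [-search, +search].
    Returns (dx, dy, residual_sad) for the best match in the previous frame.
    Assumption: the vector points from the current block to its match."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the previous frame.
            if bx + dx < 0 or bx + dx + size > width:
                continue
            if by + dy < 0 or by + dy + size > height:
                continue
            cost = sad(prev, cur, bx, by, dx, dy, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Toy data (hypothetical): the incoming frame is the previous frame with its
# content moved one pixel to the right.
prev_frame = [[x * x + 10 * y for x in range(8)] for y in range(8)]
cur_frame = [[row[max(x - 1, 0)] for x in range(8)] for row in prev_frame]

vector = find_motion_vector(prev_frame, cur_frame, bx=2, by=2)
unshifted_cost = sad(prev_frame, cur_frame, 2, 2, 0, 0, 4)
```

With these toy frames, the difference for the block without any shift is nonzero, whereas the difference at the displacement found by the search vanishes, which is the reduction that motion compensation is intended to achieve.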
In practice, the encoder employing the "motion compensation plus DCT" processing method is structured as shown in the encoder 1 of FIG. 4. An incoming picture signal S1 is input to a blocking circuit 2 and macro-blocks are formed in the blocking circuit 2 by combining a plurality of DCT blocks together. At this time, as shown in FIG. 5, the blocking circuit 2 first produces DCT blocks each thereof being formed of 8 pixels × 8 lines of each of the planes of the luminance signal Y and color difference signals R-Y and B-Y and, then, produces macro-blocks by combining, for each macro-block, four adjoining DCT blocks of the luminance signal Y and one DCT block each of the color difference signals R-Y and B-Y located correspondingly to the luminance signal Y.
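The blocking operation can be sketched as follows. This is only an illustration under stated assumptions: the four adjoining luminance DCT blocks are taken to form a 2 x 2 square, and each color difference plane is taken to have half the luminance resolution horizontally and vertically. The actual arrangement is defined by FIG. 5, which is not reproduced here, and all function and variable names are hypothetical.

```python
BLOCK = 8  # a DCT block is 8 pixels x 8 lines


def dct_block(plane, bx, by):
    """Return the 8x8 block at block coordinates (bx, by) of a plane."""
    rows = plane[by * BLOCK:(by + 1) * BLOCK]
    return [row[bx * BLOCK:(bx + 1) * BLOCK] for row in rows]


def make_macro_blocks(y_plane, r_y_plane, b_y_plane):
    """Combine four adjoining Y blocks with the co-located R-Y and B-Y blocks.
    Assumption: Y blocks form a 2x2 square and the color difference planes
    are subsampled 2:1 in both directions."""
    mb_rows = len(r_y_plane) // BLOCK
    mb_cols = len(r_y_plane[0]) // BLOCK
    macro_blocks = []
    for my in range(mb_rows):
        for mx in range(mb_cols):
            macro_blocks.append({
                "Y": [dct_block(y_plane, 2 * mx + i, 2 * my + j)
                      for j in (0, 1) for i in (0, 1)],
                "R-Y": dct_block(r_y_plane, mx, my),
                "B-Y": dct_block(b_y_plane, mx, my),
            })
    return macro_blocks


# Toy planes (hypothetical sizes): Y is 32 x 16, color difference is 16 x 8.
y_plane = [[x + 100 * y for x in range(32)] for y in range(16)]
r_y_plane = [[1] * 16 for _ in range(8)]
b_y_plane = [[2] * 16 for _ in range(8)]

macro_blocks = make_macro_blocks(y_plane, r_y_plane, b_y_plane)
```

Each resulting macro-block holds six DCT blocks in total, four of the luminance signal Y and one each of the color difference signals R-Y and B-Y.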
The encoder 1 then calculates a frame difference signal S2 between the incoming frame and the previous frame in a subsequent difference signal calculating portion 3 while making motion compensation for each macro-block as the unit. The difference signal calculating portion 3 supplies quantized data S3 output from a quantization device 5 to an adder 8 through an inverse-quantization device 6 and an inverse-DCT circuit 7. The adder 8 is also supplied with the output of a picture shift circuit 11 and, thereby, a picture signal the same as the incoming picture signal S1 is obtained in the adder 8 and this picture signal is supplied to a frame memory 9. The frame memory 9 delays the input picture signal by a one-frame period.
A motion vector detection circuit 10 receives the picture signal of the incoming frame from the blocking circuit 2, receives also the picture signal of the previous frame from the frame memory 9 and calculates the motion vector between the frames for each macro-block as the unit. A picture shift circuit 11 shifts the picture stored in the frame memory 9 by the quantity corresponding to the motion vector and then supplies the shifted picture to a difference circuit 12. Thus, the difference circuit 12 is enabled to obtain the frame difference signal S2 being close to zero and, thereby, the quantity of information is greatly reduced.
The frame difference signal S2 thus obtained is subjected to a DCT transform in a subsequent DCT circuit 4 and, thereby, it is transformed from information about the spatial axes to information about the frequency axes. As a result, the spatial redundancy is reduced. The output of the DCT circuit 4 is quantized in the subsequent quantization device 5 such that it satisfies the condition for a desired bit rate. The encoder 1 sends out the quantized data S3 thus obtained to a variable length coding circuit 13.
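The transform and quantization steps can be illustrated with a short sketch. It assumes an orthonormal two-dimensional DCT-II on an 8 x 8 block and a uniform quantizer with a single step size; the source does not specify the normalization of the DCT circuit 4 or the characteristic of the quantization device 5, so both are assumptions, as are the function names.

```python
import math

N = 8  # DCT block size


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            acc = 0.0
            for y in range(N):
                for x in range(N):
                    acc += (block[y][x]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[v][u] = scale(u) * scale(v) * acc
    return out


def quantize(coeffs, step):
    """Uniform quantization (assumption): divide by the step and round."""
    return [[round(c / step) for c in row] for row in coeffs]


# A flat difference block has no spatial detail, so all of its energy
# collapses into the single DC coefficient.
flat_block = [[10] * N for _ in range(N)]
coeffs = dct_2d(flat_block)
levels = quantize(coeffs, step=16)
nonzero = sum(1 for row in levels for level in row if level != 0)
```

A larger step gives smaller levels and therefore fewer bits after variable length coding, at the cost of a coarser reconstruction.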
Variable length coded data S4 obtained by means of the variable length coding circuit 13 is given, as needed, an error correcting code in a subsequent error correcting code adding circuit 14 and, thereafter, the data is transmitted over a predetermined transmission line or recorded on a predetermined recording medium by means of a recording apparatus such as a digital VTR or a video disk apparatus.
In the encoder 1 of the above described type, the quantization device 5 is adapted to be controlled such that the variable length coded data S4 satisfies the condition for a desired bit rate. When the encoder 1 is used in a transmission system, for example, the quantization device 5 is continuously controlled so that a buffer (not shown) at the output stage may not overflow with the data.
On the other hand, when the encoder 1 is used in a recording apparatus such as a digital VTR, since such a recording apparatus uses a relatively small data unit having a plurality of macro-blocks combined into a packet called a sync block as the minimum unit in the recording and reproduction, the quantization device 5 is controlled so that the variable length coded data S4 may not overflow the sync block.
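The control of the quantization device so that the coded data does not overflow a fixed capacity can be sketched as below. This is a simplified illustration, not the control method of the encoder 1: the bit-cost function is a hypothetical stand-in for the variable length coding circuit 13, and the candidate step sizes, the capacity and the coefficient values are invented for the example.

```python
def estimated_bits(values, step):
    """Hypothetical stand-in for the variable length coder: a zero level
    costs 1 bit, a nonzero level costs 2 * bit_length + 1 bits."""
    bits = 0
    for value in values:
        level = abs(round(value / step))
        bits += 1 if level == 0 else 2 * level.bit_length() + 1
    return bits


def pick_step(values, capacity_bits, steps=(1, 2, 4, 8, 16, 32, 64)):
    """Return the finest candidate step whose estimated size fits within
    the capacity, or None if even the coarsest step overflows."""
    for step in steps:
        if estimated_bits(values, step) <= capacity_bits:
            return step
    return None


# Hypothetical coefficient sets: one easy to compress, one difficult.
smooth = [40, 3, -2, 1, 0, 0, 0, 0]
detailed = [40, -35, 28, -22, 17, -12, 9, -6]

smooth_step = pick_step(smooth, capacity_bits=40)
detailed_step = pick_step(detailed, capacity_bits=40)
```

For the same capacity, the data that is easy to compress is assigned a much finer step than the data that is difficult to compress, which is the behavior that matters when the capacity is the small, fixed size of a sync block.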
The sync block is formed of ten to several tens of macro-blocks combined into a packet and given such information as the block number and sync pattern. Further, the quantities of data placed in each of the sync blocks are adapted to be equal to one another. The sync block is formed of such a relatively small data unit because this ensures detection of sync blocks at the time of high speed reproduction, such as shuttling, and also reduces erroneous transmission.
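The fixed-size packet structure can be pictured with the following sketch. The sync pattern value, the one-byte block number and the payload size are hypothetical; the source gives no field layout, so this only illustrates the idea that every sync block carries identification information and an equal quantity of data.

```python
from dataclasses import dataclass

SYNC_PATTERN = b"\xff\x00"  # hypothetical sync pattern
PAYLOAD_BYTES = 16          # hypothetical fixed quantity of data per block


@dataclass
class SyncBlock:
    number: int      # block number, assumed here to fit in one byte
    payload: bytes   # variable length coded data of the packed macro-blocks

    def to_bytes(self):
        """Serialize as sync pattern + block number + fixed-size payload."""
        if len(self.payload) > PAYLOAD_BYTES:
            raise ValueError("coded data overflows the sync block")
        padded = self.payload.ljust(PAYLOAD_BYTES, b"\x00")
        return SYNC_PATTERN + bytes([self.number]) + padded


packet = SyncBlock(number=3, payload=b"coded").to_bytes()
```

Because every packet has the same length and begins with the same pattern, a reproducing apparatus can locate individual sync blocks even when only fragments of the recorded data are read.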
In the case of transmission systems, the combination of blocks subjected to variable length coding does not produce much trouble because the buffer is large enough, but in the case of recording apparatuses, the sync block cannot be made so large and, therefore, the combination of blocks for structuring the sync block becomes an important problem.
For example, the case where sync blocks #0-#3 are structured as shown in FIG. 6 will be considered. In the case of FIG. 6, each of the sync blocks #0-#3 is formed of one macro-block as the unit, i.e., the sync block #0 is formed of the macro-block MB0, the sync block #1 is formed of the macro-block MB1, and so on. Now, let us consider a picture of which the upper half depicts the sky and the lower half depicts a field of flowers. If it is assumed that the portion of the sky is assigned to the macro-blocks MB0 and MB1 and the portion of the field of flowers is assigned to the macro-blocks MB2 and MB3, since it is necessary that the quantities of data placed in each of the sync blocks are equal to one another, it follows that fine quantization is applied to the data placed in the sync blocks #0 and #1, while coarse quantization is applied to the data placed in the sync blocks #2 and #3. As a result, there arises a problem that the picture packeted into the sync blocks #2 and #3 deteriorates in picture quality.
Therefore, from the point of view of picture quality, it is desired that each sync block be formed of blocks gathered from different positions all over the screen (i.e., that the blocks be shuffled). The reason is as follows. Suppose that the portion easy to compress and the portion difficult to compress are unevenly distributed on the screen, as in the case where the upper half of the screen depicts the sky and the lower half depicts a field of flowers. If some sync blocks are formed only of the macro-blocks from the portion easy to compress, very fine quantization will be made for those sync blocks and, hence, their picture quality will be good. However, since other sync blocks are formed only of the macro-blocks from the portion difficult to compress, very coarse quantization will be made for these sync blocks and deterioration in the picture quality in these sync blocks will become conspicuous.
If, then, sync blocks are formed of macro-blocks gathered from here and there on the screen, equalization can be achieved to a certain degree and, therefore, it can be expected that deterioration in the picture quality due to quantization will become less than that in the case where sync blocks are formed of adjoining macro-blocks. However, since each macro-block is originally formed of a luminance signal Y and color difference signals located correspondingly to the luminance signal Y on the screen, it can be said that each macro-block is formed of blocks which are virtually equal to one another in difficulty of compression. As a result, there has been a problem that the effect of the averaging is lowered.
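The equalizing effect of gathering macro-blocks from different screen positions can be illustrated numerically. In this sketch each macro-block is represented by a single hypothetical "difficulty" figure (10 for the sky, 90 for the field of flowers), and the interleaving rule used for shuffling is one simple choice among many; neither is taken from the source.

```python
def group_adjacent(costs, per_block):
    """Form sync blocks from adjoining macro-blocks."""
    return [costs[i:i + per_block] for i in range(0, len(costs), per_block)]


def group_shuffled(costs, per_block):
    """Form sync blocks from macro-blocks spread over the screen.
    Assumption: a fixed-stride interleave stands in for the shuffling."""
    n_blocks = len(costs) // per_block
    return [costs[i::n_blocks] for i in range(n_blocks)]


# Hypothetical difficulty per macro-block in raster order:
# upper half of the screen is sky, lower half is a field of flowers.
costs = [10, 10, 10, 10, 90, 90, 90, 90]

adjacent_totals = [sum(b) for b in group_adjacent(costs, per_block=2)]
shuffled_totals = [sum(b) for b in group_shuffled(costs, per_block=2)]
```

With adjoining macro-blocks, the totals per sync block differ widely, so a fixed capacity forces coarse quantization on the difficult sync blocks. With the interleaved grouping, every sync block carries the same total and can be quantized equally.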
Further, when sync blocks are formed in units of macro-blocks as described above, there also arises the problem that well-reproduced pictures cannot be obtained at the time of high speed reproduction, such as shuttling reproduction, because the block size is large.