This invention relates to a method and apparatus for coding a moving picture.
When 30 frames of an image are displayed sequentially in one second, the human eye is capable of perceiving these sequentially displayed frames as a moving picture. Accordingly, how to code each frame image efficiently without sacrificing quality is important.
In general, three methods of coding frames are known. These are as follows:
1. a method of coding interframe difference;
2. a method of motion-compensated coding of interframe difference; and
3. a method of intraframe coding.
Overall coding of a moving picture is carried out using these methods in suitable fashion.
1. Method of coding interframe difference
Assume, by way of example, that the image of one frame is divided into n×n pixel blocks (each hereinafter referred to simply as a "block"). This coding method involves obtaining the difference between the information of a certain block in a frame to be coded and the information of the block at the same position in the previous frame and then coding this difference. With this method, the frame-to-frame difference becomes closer to zero the higher the frame correlation, or in other words, the smaller the change in the image from frame to frame. Coding efficiency rises as a result. The block of difference data is subjected to a discrete cosine transformation, the transform coefficient data obtained is quantized and Huffman coding is applied. If all of the data becomes zero after quantization, this means that the quantized image is the same as that of the same block in the previous frame. Accordingly, there is no transmission of code data.
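The steps of this method can be illustrated by the following minimal sketch, which is not taken from the coder described here. It assumes a small square block held as a list of lists, a naive orthonormal DCT, and a single uniform quantization step; the names dct_2d, code_interframe_block and q_step are hypothetical, and the Huffman coding stage is omitted.

```python
import math

def dct_2d(block):
    """Naive orthonormal 2-D DCT-II of a square block (list of lists)."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * total
    return out

def code_interframe_block(current, previous, q_step=16):
    """Return quantized DCT levels of (current - previous), or None to skip.

    q_step is an assumed uniform quantization step, not a value from the text.
    """
    n = len(current)
    # Difference against the block at the same position in the previous frame.
    diff = [[current[i][j] - previous[i][j] for j in range(n)] for i in range(n)]
    coeffs = dct_2d(diff)
    levels = [[int(round(c / q_step)) for c in row] for row in coeffs]
    # All-zero levels: the block equals the previous one, so no code is sent.
    if all(level == 0 for row in levels for level in row):
        return None
    return levels
```

A return value of None corresponds to the case in which no code data is transmitted and the decoder simply reuses the block of the previous frame.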
2. Method of motion-compensated coding of interframe difference
This method involves matching the block to be coded against a plurality of neighboring blocks centered on the block at the same position in the previous frame and then selecting the block that is most similar. The difference between the block to be coded and the selected block is subjected to a discrete cosine transformation and quantized. Huffman coding is applied to the quantized data.
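The matching step can be sketched as an exhaustive search over a small window in the previous frame. The text does not state the similarity measure, so the sum of absolute differences (SAD) used below is an assumption, as are the function names and the block size and search range given as defaults.

```python
def sad(frame_a, ay, ax, frame_b, by, bx, n):
    """Sum of absolute differences between two n x n blocks."""
    return sum(abs(frame_a[ay + i][ax + j] - frame_b[by + i][bx + j])
               for i in range(n) for j in range(n))

def best_match(current, previous, top, left, n=4, search=2):
    """Find the displacement (dy, dx) of the most similar block.

    The block of `current` at (top, left) is compared with blocks of
    `previous` displaced by up to `search` pixels in each direction.
    Returns ((dy, dx), cost); (0, 0) is the block at the same position.
    """
    height, width = len(previous), len(previous[0])
    best_vec = (0, 0)
    best_cost = sad(current, top, left, previous, top, left, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the previous frame.
            if y < 0 or x < 0 or y + n > height or x + n > width:
                continue
            cost = sad(current, top, left, previous, y, x, n)
            if cost < best_cost:
                best_vec, best_cost = (dy, dx), cost
    return best_vec, best_cost
```

The returned displacement plays the role of the vector information that accompanies the coded difference data.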
3. Method of intraframe coding
This coding method involves directly applying the discrete cosine transformation to the entire frame to be coded, quantizing the transform coefficient data obtained and then coding the quantized data. The two methods described above generally provide a higher coding efficiency than this method. However, if frame correlation is small, i.e., if the frame-to-frame image changes, as when there is a change of scene, the dynamic range of the interframe difference increases and the amount of information in the coded data increases rather than decreases. Thus, this intraframe coding method is effective in cases where the correlation of a frame of interest with respect to the previous frame is small.
The Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO) recommends that a combination of DCT (discrete cosine transformation) and motion compensation be adopted as an international standard. According to this method, data coded by the intraframe coding method is transferred once every 15 frames (an interval of 0.5 sec in a case where 30 frames are displayed in one second). The reason is that, in a case where coding based upon correlation with the previous frame has continued for a certain number of frames (or for a certain time), quantization error becomes large and this results in an image that appears unnatural to the human eye. The method recommended by the MPEG is for dealing with this problem.
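The fixed refresh period can be expressed in a few lines. The sketch below only restates the arithmetic of the figures given above (15 frames at 30 frames per second); the names are hypothetical, and counting frames from zero so that frame 0 is a refresh frame is an assumption.

```python
FRAME_RATE = 30        # frames displayed per second
REFRESH_PERIOD = 15    # frames between intraframe-coded frames

def is_intra_refresh(frame_index, period=REFRESH_PERIOD):
    """True when the frame at this index is sent as intraframe data."""
    return frame_index % period == 0

# Time between two refresh frames, in seconds.
refresh_interval_sec = REFRESH_PERIOD / FRAME_RATE
```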
Generally, in a case where intraframe data is produced at a fixed period, no problems arise if the image happens to change dynamically in conformity with this period. However, if the dynamic change in the image occurs just when coding based upon matching with the previous frame is being carried out, an unnatural image is displayed momentarily.
Accordingly, it has been contemplated to change over among the three coding methods at appropriate times. This will be described with reference to the flowchart of FIG. 16.
It should be noted that an overflow signal (signal 304 in FIG. 19) in the following description is a signal indicating that the amount of data held in a buffer memory, which is for temporarily storing the coded data, has attained a predetermined size. The signal possesses a certain amount of margin.
First, at step S101 of the flowchart, it is determined whether an overflow state has been attained. If the overflow state is in effect, it is decided to apply interframe coding processing to the pixel block of interest.
If the overflow state is not in effect, the program proceeds to step S102, at which it is determined whether the motion-compensated coding of interframe difference is to be carried out. More specifically, as shown in FIG. 17, a point is defined by two values: the difference value obtained when the block of interest is subjected to motion-compensated coding of interframe difference, and the difference value obtained when it is not. It is determined whether this point lies in a region MC in FIG. 17. Though it may seem acceptable to make the judgment based upon whether the point is above or below a straight line 1700 passing through the origin and having a slope of m=1, the processing for motion-compensated coding of interframe difference also generates vector information regarding the block of interest. A straight line 1701 in FIG. 17 is the result of taking account of the amount of data in this vector information.
If it is determined at step S102 that the block of interest lies within the region MC, then it is decided to execute processing for motion-compensated coding of interframe difference.
If it is determined at step S102 that the block of interest lies within a region labeled "INTER", then processing for motion-compensated coding of interframe difference is not executed and the program proceeds to step S103.
It is decided at step S103 to execute either intraframe coding processing or interframe coding processing. More specifically, a variance value of a difference in interframe data in a case where motion compensation is applied to the block of interest and a variance value of a difference in a case where the block of interest is subjected to processing without motion compensation are obtained. If the block of interest resides in a region labeled "INTRA" in FIG. 18, then intraframe coding processing is selected. If the block of interest resides in a region labeled "INTER" in FIG. 18, then processing for coding of interframe difference is selected.
In the selection processing described above, the selection is made using the statistics of a 16×16 pixel block of luminance data.
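The order of the decisions at steps S101 to S103 can be summarized in the following sketch. The actual region boundaries are defined only by FIG. 17 and FIG. 18 and are not reproduced here, so they are passed in as predicates; the two example predicates and their constants (VECTOR_COST, INTRA_THRESHOLD) are purely hypothetical stand-ins and do not represent the curves in the figures.

```python
def select_mode(overflow, diff_no_mc, diff_mc, var_no_mc, var_mc,
                in_region_mc, in_region_intra):
    """Return "INTER", "MC" or "INTRA" for the block of interest."""
    # Step S101: in the overflow state, interframe coding is chosen outright.
    if overflow:
        return "INTER"
    # Step S102: is the point (diff_no_mc, diff_mc) inside region MC of FIG. 17?
    if in_region_mc(diff_no_mc, diff_mc):
        return "MC"
    # Step S103: is the point (var_no_mc, var_mc) inside region INTRA of FIG. 18?
    if in_region_intra(var_no_mc, var_mc):
        return "INTRA"
    return "INTER"

# Hypothetical placeholders for the boundaries of FIG. 17 and FIG. 18.
VECTOR_COST = 2.0        # assumed penalty for sending vector information
INTRA_THRESHOLD = 64.0   # assumed variance above which INTRA is preferred

def example_region_mc(diff_no_mc, diff_mc):
    # Assumed rule: motion compensation must win by more than the vector cost.
    return diff_mc + VECTOR_COST < diff_no_mc

def example_region_intra(var_no_mc, var_mc):
    # Assumed rule: both difference variances are large.
    return min(var_no_mc, var_mc) > INTRA_THRESHOLD
```

Passing the boundaries as functions keeps the fixed decision order separate from the region shapes, which would have to be read from the figures.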
The construction of the coder will be described with reference to FIG. 19.
As shown in FIG. 19, block data entered from an input line 301 enters a subtracter 30 and a mode decision unit 37. A signal line 304 is an input signal line for an overflow signal of an output-code buffer. This signal enters the mode decision unit 37 and a masking unit 32.
The mode decision unit 37 performs decision processing in accordance with the above-described method upon referring to the block data that has entered from the signal line 303 and the previous-frame data that has been stored in a previous-frame memory 38. When it has been decided that the method of coding interframe difference is appropriate, the decision unit 37 reads data of a block at a position identical with that of the input block out of the previous-frame memory 38 and delivers this data on a signal line 308. When it has been decided that the method of motion-compensated coding of interframe difference is appropriate, the decision unit 37 reads the best matching block data out of the previous-frame memory 38 via line 307 and outputs this data on the signal line 308. When it has been decided that the intraframe coding method is appropriate, the decision unit 37 produces zero data and delivers this data on the signal line 308. At the same time, the decision unit 37 outputs, on a signal line 312, vector data representing the relative positions of the best matching block, which has been obtained by motion compensation, and the coded block.
The subtracter 30 obtains the difference between the block data from the signal line 308 and the input block data from the signal line 302. The resulting difference data enters the masking unit 32 through a DCT (discrete cosine transformation) circuit 31.
When the signal from the signal line 304 indicates the overflow state, the mode decision unit 37 selects interframe-difference coding unconditionally. Further, the masking unit 32 masks all of the difference data to zero. The DCT coefficient data from the masking unit 32 is quantized by a quantizer 33 and the quantized data is fed into a coder 34 and a reverse quantizer 35.
Upon referring to a selection-mode signal 311 from the mode decision unit 37, the coder 34 assigns a Huffman code to the quantized DCT coefficient data and outputs the result on a signal line 313. By means of reverse quantization, the reverse quantizer 35 reproduces frequency data identical with that sent to an external decoder, not shown. The reproduced frequency data is transformed into a difference signal again by a reverse-DCT circuit 36; this difference signal is added to the signal from the signal line 308 by an adder 39, thereby reproducing an image identical with that transmitted. This image is stored again in the previous-frame memory 38.
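The reason the coder contains its own reverse quantizer and adder can be seen in a simplified sketch of the loop. For brevity the DCT and reverse-DCT stages are left out and the difference is quantized directly, which is a simplification and not the arrangement of FIG. 19; the function names and the uniform step are assumptions.

```python
def quantize(values, step):
    """Uniform quantization to integer levels."""
    return [int(round(v / step)) for v in values]

def dequantize(levels, step):
    """Reverse quantization: recover approximate values from levels."""
    return [level * step for level in levels]

def encode_block(block, prediction, step):
    """Code one block and rebuild what the decoder will see.

    `prediction` stands for the data on signal line 308: the co-located
    block, the best matching block, or all zeros for intraframe coding.
    Returns (levels, reconstructed).
    """
    diff = [b - p for b, p in zip(block, prediction)]
    levels = quantize(diff, step)
    # Local decoding: same arithmetic as the external decoder.
    reconstructed = [p + d for p, d in zip(prediction, dequantize(levels, step))]
    return levels, reconstructed

def decode_block(levels, prediction, step):
    """What an external decoder would compute from the same levels."""
    return [p + d for p, d in zip(prediction, dequantize(levels, step))]
```

Because the reconstructed block, rather than the original input block, is what gets stored as the previous frame, the coder and the decoder predict from identical data even though quantization has discarded information.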
With the processing described above, there is a tendency for the quantization steps to become large if the amount of information possessed by an image is very large with respect to the transmission rate. Accordingly, a drawback is that data that is actually quantized, coded and transmitted is limited to data possessing a sufficiently large amount of information in the state prior to quantization.
Further, the decision regarding which type of coding processing is performed is made based upon a fixed determination. As a consequence, even a block in which no data is left after quantization of the interframe difference, i.e., even a block for which it is unnecessary to send a code in the current quantization step, is subjected to inappropriate coding processing, as a result of which an unnecessary code is transmitted.
In a case where coding processing is performed at a constant bit rate, the average code quantity allocated to one block is extremely small. Consequently, with regard to a block for which the difference between the previous frame and the current frame is small, a method is employed in which the data of the previous frame is used as is and there is no transmission whatsoever of a code indicative of the difference. The code allocation for a block having a large difference is increased correspondingly.
However, when the correlation between two mutually adjacent frames is small over the entire frame, as when there is a change in scene, the coding of all blocks becomes necessary and the codes generated for individual blocks become large. At such time, the above-described coding method involves subjecting one block to coding processing and results in the generation of a large quantity of code. This means that many other blocks can no longer be coded.
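The starvation effect described above can be illustrated with a toy model. The sketch assumes a fixed code budget per frame and blocks coded in scan order, with a block skipped when its code does not fit in what remains; the function name, the budget and the per-block costs are hypothetical numbers chosen only for illustration.

```python
def code_frame_in_order(block_costs, frame_budget):
    """Return the indices of blocks that could be coded within the budget.

    block_costs[i] is the assumed code quantity needed by block i.
    Blocks are visited in scan order; a block that does not fit in the
    remaining budget is left uncoded.
    """
    remaining = frame_budget
    coded = []
    for index, cost in enumerate(block_costs):
        if cost <= remaining:
            coded.append(index)
            remaining -= cost
    return coded
```

With costs of [900, 400, 400, 400] and a budget of 1000, for example, only the first block is coded, although every block in the frame has changed.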