This invention relates to signal coding and, more particularly, to a method and apparatus for encoding and decoding video signals of moving images.
Video signals typically originate from video cameras. The bandwidth of video signals is quite substantial and, consequently, practitioners in the art have tried to reduce the bandwidth of these signals without unduly degrading the images. Typically, to reduce bandwidth, the video signals are encoded, and redundancies in the encoded signals are extracted and deleted. Different techniques are used in the art; some are better suited for still images, while others are better suited for moving images. One of the techniques for reducing the bandwidth of moving images is generally referred to as motion compensated predictive coding.
In conventional motion compensated predictive coding, each video frame is first partitioned into square blocks of picture elements (pels), such as blocks of 8 pels by 8 pels. Each block is coded, in turn, and the developed encoded sequence is transmitted over a communications channel to a decoder. The communications channel may be, or may include, a storage element. In coding a block, a determination is first made as to whether or not the pels of the block have changed significantly compared with the previous frame. If not, an indicator signal is sent which signifies to the decoder that it needs merely to repeat the pels of that block from the previous frame to obtain the pels for the current block. This is known as "Conditional Replenishment". If the pels have changed since the previous frame, an attempt is made to determine the best estimate of the motion that is occurring in the block. This is frequently done by a "Block Matching Motion Estimation" technique wherein the pels of the current block are successively compared with various small shifts of the corresponding block in the previous frame. The shift that gives the best match is deemed to be the "best estimate" of the displacement in the block's image between frames, and the amount of this shift, called the "Motion Vector", is selected and sent to the decoder.
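The two decisions described above can be illustrated with a minimal Python sketch. It is not taken from any particular encoder. It assumes frames are lists of rows of integer pel values, uses the sum of absolute differences (SAD) as the match criterion, and searches exhaustively over a small window. The names `sad`, `block_changed` and `block_match`, the `threshold` value, and the `search` range are all hypothetical choices made for illustration; the text does not specify the matching criterion or the search strategy.

```python
def sad(cur, ref, top, left, dy, dx, size):
    """Sum of absolute differences between the current block and the
    block in `ref` displaced by (dy, dx) from the same position."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(cur[top + r][left + c] - ref[top + r + dy][left + c + dx])
    return total


def block_changed(cur, prev, top, left, size=8, threshold=64):
    """Conditional replenishment test (assumed criterion): the block is
    treated as changed when its unshifted SAD exceeds `threshold`."""
    return sad(cur, prev, top, left, 0, 0, size) > threshold


def block_match(cur, prev, top, left, size=8, search=2):
    """Exhaustive block matching. Returns (dy, dx, cost), where (dy, dx)
    is the offset into the previous frame that gives the lowest SAD.
    Shifts that would fall outside the frame are skipped."""
    height, width = len(prev), len(prev[0])
    best = (0, 0, sad(cur, prev, top, left, 0, 0, size))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if top + dy < 0 or top + dy + size > height:
                continue
            if left + dx < 0 or left + dx + size > width:
                continue
            cost = sad(cur, prev, top, left, dy, dx, size)
            if cost < best[2]:
                best = (dy, dx, cost)
    return best
```

In this sketch the returned vector points from the current block to its best match in the previous frame, so image content that moved down and to the right yields negative components. That sign convention is an assumption; a real system may define the Motion Vector with the opposite sense.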
The pels of the current block are then compared with those of the "best" shifted block from the previous frame to see if there is a significant difference. If not, an indicator signal is sent to the decoder, which merely causes the pels of the shifted block from the previous frame to be repeated as the pels of the current block. Such blocks are said to have been successfully "Motion Compensated". However, if there is a significant difference between the two blocks, the difference is encoded and sent to the decoder so that the pels of the current block may be more accurately recovered. Coding of this difference is typically performed by means of the "Discrete Cosine Transform" (DCT).
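As a rough illustration of this step, the following Python sketch decides whether a block counts as motion compensated and, if not, transforms the difference block with an orthonormal two-dimensional DCT (type II). The function names `dct_2d` and `code_difference`, the SAD-based significance test, and the `threshold` parameter are assumptions made for the example; quantization and entropy coding of the coefficients, which a practical coder would also need, are omitted.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    def basis(i, k):
        return math.cos((2 * i + 1) * k * math.pi / (2 * n))

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for r in range(n):
                for c in range(n):
                    total += block[r][c] * basis(r, u) * basis(c, v)
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


def code_difference(current, predicted, threshold):
    """Compare the current block with its motion-shifted prediction.
    Returns ("compensated", None) when the difference is small (assumed
    test: SAD <= threshold), else ("difference", DCT coefficients)."""
    n = len(current)
    diff = [[current[r][c] - predicted[r][c] for c in range(n)] for r in range(n)]
    error = sum(abs(d) for row in diff for d in row)
    if error <= threshold:
        return ("compensated", None)
    return ("difference", dct_2d(diff))
```

For a difference block that is constant, all of the energy lands in the single DC coefficient, which is the property that makes the transform attractive for smooth residuals.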
The volume of code that is generated by the above procedure is variable. It can be appreciated, for example, that image changes that do not correspond to a uniform translation, or motion, of the image may require substantial encoding to describe the deviation of a block from its best translated replica. On the other hand, when the image does not change between successive frames, then there is a minimal amount of information that needs to be encoded. To accommodate these potentially wide variations in the amount of code that needs to be transmitted, typical encoders include a FIFO memory at the output, to serve as a buffer.
The FIFO is not a panacea. For a given transmission rate, when an excessive volume of data is generated, there is always a danger that the FIFO will overflow. When it does, coding must stop until the transmission channel can empty the FIFO sufficiently for new data to be accepted into it. Since it is inconvenient to stop encoding in the middle of a frame, most systems discard an entire frame whenever the FIFO buffer is full, or nearly so. To compensate for the loss of a frame, such systems cause the decoder to repeat its most recently available frame. Frame repeating results in moving objects in the scene being reproduced in a jerky fashion, rather than in the smooth way that would occur if frame repeating were not invoked.
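The buffer behaviour described above can be modelled with a short Python sketch. It is a deliberately simplified model under stated assumptions: the channel drains a fixed number of bits per frame interval, the number of bits each frame would generate is known in advance, and a frame is discarded whenever it would not fit in the remaining buffer space. The function name `simulate_buffer` and its parameters are hypothetical.

```python
def simulate_buffer(frame_bits, channel_bits_per_frame, capacity):
    """Return one boolean per frame: True if the frame was coded and
    buffered, False if it was discarded (the decoder would then repeat
    its most recently available frame)."""
    fullness = 0
    coded = []
    for bits in frame_bits:
        # The channel removes a fixed amount of data each frame interval.
        fullness = max(0, fullness - channel_bits_per_frame)
        if fullness + bits > capacity:
            coded.append(False)   # buffer too full: drop the whole frame
        else:
            fullness += bits
            coded.append(True)
    return coded
```

With `frame_bits=[100, 400, 400, 100]`, `channel_bits_per_frame=150` and `capacity=500`, the third frame is the one discarded: two consecutive high-volume frames arrive faster than the channel can empty the buffer.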
There have been some suggestions for improving the quality of the repeated frames in order to make them more faithfully resemble the original. One technique is called "Motion Compensated Interpolation". With this technique, instead of simply repeating the pels from the previous frame, the Motion Vectors are used to laterally displace the block by the appropriate amount prior to display. In other words, this method creates the missing block of pels by averaging over the immediately previous and following blocks of pels that are available to the decoder. While this might seem to be a good idea, experimental results show that when the images of successive blocks do not represent translational motion, the reproduced image may be worse than with frame repeating. Although it has been observed that this degradation is caused by relatively few pels that do not conform to the assumption of translational motion, putting these pels in the wrong place creates highly visible artifacts.
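One possible reading of Motion Compensated Interpolation is sketched below in Python. It assumes that the missing frame lies midway in time between the previous and following frames, that the Motion Vector `mv` gives the displacement of the block's content from the previous frame to the following frame, and that both components of `mv` are even so that half the vector is a whole number of pels. The function name `interpolate_block` and these conventions are illustrative assumptions, not details given in the text.

```python
def interpolate_block(prev, nxt, top, left, mv, size=8):
    """Estimate a block of the missing middle frame by averaging the
    previous-frame block displaced by half the motion vector with the
    following-frame block displaced by half the vector the other way.
    `mv` is (dy, dx), the motion of content from `prev` to `nxt`.
    The caller must keep all indices inside the frame; negative indices
    would silently wrap around in Python lists."""
    half_dy, half_dx = mv[0] // 2, mv[1] // 2
    block = []
    for r in range(size):
        row = []
        for c in range(size):
            before = prev[top + r - half_dy][left + c - half_dx]
            after = nxt[top + r + half_dy][left + c + half_dx]
            row.append((before + after) / 2)
        block.append(row)
    return block
```

When the content of the block really does translate uniformly, the two displaced pels coincide and the average reproduces the missing block exactly. When some pels do not follow the translation, the same averaging places them incorrectly, which is the source of the artifacts noted above.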