The availability of memories capable of storing frames of television data, as well as other solid state digital devices, has made it possible to convey, within present 6 MHz channels, images of much higher definition than those currently available. It has been known for many years that the bit rate required for televising or otherwise conveying images could be significantly reduced by transmitting the differences between the signals for adjacent frames. After a first frame is transmitted, successive frames can be formed at the receiver by making changes in accordance with the transmitted frame-to-frame difference signals.
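By way of illustration only, frame-to-frame difference transmission may be sketched as follows. The sketch assumes each frame is a flat list of integer pixel values; the function names `encode_differences` and `decode_differences` are hypothetical and are not taken from any particular system.

```python
def encode_differences(frames):
    """Return the first frame plus the per-pixel differences between adjacent frames."""
    first = list(frames[0])
    diffs = [[c - p for c, p in zip(cur, prev)]
             for prev, cur in zip(frames, frames[1:])]
    return first, diffs


def decode_differences(first, diffs):
    """Rebuild every frame at the receiver by applying each difference in turn."""
    frames = [list(first)]
    for d in diffs:
        frames.append([p + x for p, x in zip(frames[-1], d)])
    return frames
```

When little changes between frames, most entries of each difference list are zero, which is what makes the differences cheaper to convey than the frames themselves.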
One problem with this approach is that any errors introduced by noise and other effects are cumulative. Furthermore, it is necessary to transmit a nearly complete frame when there is a great deal of motion in the scene. Therefore, systems employing motion compensation were developed. The current frame is effectively divided into a number of discrete areas called motion blocks, and for each motion block a motion vector is derived that gives the x, y movement required for a matching block in the last frame to reach the position of the motion block in the current frame. The criterion for selecting a matching block may be the mean square error or the mean absolute difference. The search for a matching block is limited to a search area surrounding the block that has the same position in the last frame as the motion block has in the current frame. These motion vectors are transmitted to the receiver at the cost of very few extra bits. What is called a predicted frame is formed at both the receiver and the transmitter by rearranging the matching blocks of the last frame in accordance with the motion vectors. One might conclude that this would be all that is necessary, but the predicted frames at the receiver and the transmitter are only predictions and are subject to error, which could become cumulative.
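The block matching search described above may be sketched as follows, using the mean absolute difference as the matching criterion. This is a minimal exhaustive search under stated assumptions: frames are two-dimensional lists of integers indexed `[row][column]`, the block size and search range are arbitrary illustrative values, and the names `mean_abs_diff` and `find_motion_vector` are hypothetical. The vector is expressed as the movement from the matching block in the last frame to the motion block in the current frame.

```python
def mean_abs_diff(cur, last, bx, by, mx, my, size):
    """Mean absolute difference between the motion block at (bx, by) in the
    current frame and a candidate block at (mx, my) in the last frame."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[by + j][bx + i] - last[my + j][mx + i])
    return total / (size * size)


def find_motion_vector(cur, last, bx, by, size=4, search=2):
    """Search the area within +/- `search` pixels of the co-located block and
    return ((dx, dy), error) for the best match."""
    height, width = len(last), len(last[0])
    best = (0, 0)
    best_err = mean_abs_diff(cur, last, bx, by, bx, by, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Candidate matching block: moving it by (dx, dy) lands on (bx, by).
            mx, my = bx - dx, by - dy
            if mx < 0 or my < 0 or mx + size > width or my + size > height:
                continue  # candidate would fall outside the last frame
            err = mean_abs_diff(cur, last, bx, by, mx, my, size)
            if err < best_err:
                best, best_err = (dx, dy), err
    return best, best_err
```

Replacing the absolute value with a squared difference would give the mean square error criterion instead.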
Therefore, the differences between the current frame, which, of course, is only available at the transmitter, and the predicted frame at the transmitter are derived and transmitted to the receiver, and the image to be displayed at the receiver is formed by adding these transmitted differences to its predicted frame.
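The formation of the predicted frame and the correction by transmitted differences may be sketched as follows. The sketch assumes frames are two-dimensional lists of integers, that the motion vectors are supplied as a dictionary mapping each block origin `(bx, by)` to its vector `(dx, dy)`, and that every vector keeps its matching block inside the frame. The names `predict_frame`, `subtract` and `add` are hypothetical.

```python
def predict_frame(last, vectors, size):
    """Form a predicted frame by copying each matching block of the last frame
    to the position of its motion block."""
    height, width = len(last), len(last[0])
    predicted = [[0] * width for _ in range(height)]
    for (bx, by), (dx, dy) in vectors.items():
        for j in range(size):
            for i in range(size):
                # The matching block sits at (bx - dx, by - dy) in the last frame.
                predicted[by + j][bx + i] = last[by - dy + j][bx - dx + i]
    return predicted


def subtract(a, b):
    """Per-pixel difference a - b (formed at the transmitter)."""
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def add(a, b):
    """Per-pixel sum a + b (formed at the receiver)."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]
```

At the transmitter, `subtract(current, predicted)` yields the differences to be sent; at the receiver, `add(predicted, differences)` yields the image to be displayed. If the differences arrive unaltered, the displayed image equals the current frame regardless of how good the prediction was.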
Whereas advantageous bit rate reduction is attained in this manner, further reduction has been attained by transmitting the difference signals in coded form. The frames of difference signals are divided into contiguous data blocks that may or may not be the same size as the motion blocks. The data blocks of the current frame and the predicted frame are scanned, and the signals from the predicted frame are subtracted from the signals from the current frame so as to derive difference signals. The difference signals are applied to means such as a Discrete Cosine Transform (DCT) for deriving coefficients corresponding to different frequencies; the coefficients appear in a coefficient block at positions corresponding to those of the pixels in the data block. A DCT places a value proportional to the average value of all the pixels in a block at the upper left, which is the DC coefficient. In going to the right, the coefficients are for increasing discrete horizontal frequencies, and in going down, the coefficients are for increasing discrete vertical frequencies. Thus, the coefficients in zigzag diagonal paths from the coefficient at the upper left to the coefficient at the lower right are for increasing discrete frequencies. The highest discrete frequencies are located in the lower right corner of the coefficient block. If the pixels in the original image are each represented by eight bits, the pixels in the difference signals are represented by nine bits and form the input to the DCT processing. The DCT coefficients are usually represented by 10 bits with one extra bit to represent the sign. Thus, there is no reduction in bits and therefore no reduction in bit rate at this point. However, a property of the DCT is that it decorrelates pixels into coefficients in such a way that, for a normal image, the DC coefficient and the low frequency coefficients have large values while the high frequency coefficients have small or even zero values.
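The layout of the coefficient block may be illustrated by the following sketch of a two-dimensional DCT and a zigzag scan order. It is a direct, unoptimized evaluation of the orthonormal DCT-II, which is an assumed normalization; with it, the DC coefficient of an 8 by 8 block is eight times the block average, and other scalings are equally possible. The names `dct_2d` and `zigzag_order` are hypothetical.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block, indexed [row][column]."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for v in range(n):          # vertical frequency index (row of the coefficient block)
        for u in range(n):      # horizontal frequency index (column)
            s = 0.0
            for y in range(n):
                for x in range(n):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[v][u] = scale(u) * scale(v) * s
    return coeffs


def zigzag_order(n):
    """(row, column) positions along zigzag diagonals, from upper left to lower right."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
```

For a block whose pixels vary only from left to right, every coefficient below the top row comes out as zero to within rounding, and for a uniform block only the DC coefficient is nonzero. Smooth content thus concentrates its energy in the DC and low frequency coefficients at the start of the zigzag path.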
This property of the DCT is very valuable in compressing the data rate, especially when followed by a quantizer which quantizes the DCT coefficients coarsely. Further bit rate reduction can be achieved by coupling a Huffman coder between the output of the quantizer and the encoder output.
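The effect of a coarse quantizer followed by a Huffman coder may be sketched as follows. The uniform step size, the example coefficient values and the names `quantize`, `dequantize` and `huffman_code` are assumptions made for illustration; practical coders typically use frequency-dependent step sizes and code runs of zeros rather than individual symbols.

```python
import heapq
from collections import Counter


def quantize(coeffs, step):
    """Uniform quantizer: map each coefficient to the nearest multiple of `step`."""
    return [int(round(c / step)) for c in coeffs]


def dequantize(levels, step):
    """Inverse of the quantizer, up to the rounding error it introduced."""
    return [q * step for q in levels]


def huffman_code(symbols):
    """Build a Huffman code (symbol -> bit string) from observed symbol counts."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (count, unique tie-breaker, partial code table).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]
```

With the illustrative coefficients `[80.0, 12.3, -7.9, 1.2, -0.8, 0.4, 0.0, 0.1]` and a step of 4, the small coefficients all quantize to zero, and the Huffman code assigns the frequent zero level a one-bit code word.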
Formation of the predicted frame at the transmitter is effected by applying motion compensation to the last frame as previously described, and the next image frame is formed by inverting the effects of the quantizer and the DCT on the coded difference signals so as to recover the difference signals and adding the result to the predicted frame. This becomes the new last frame and is the same as the image that is produced at the receiver.
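The reason the transmitter reconstructs its last frame from the quantized difference signals, rather than from the original frame, may be seen in the following simplified sketch. To keep it short, motion compensation and the DCT are omitted (the prediction is simply the last reconstructed frame and the transform is treated as an identity), frames are flat lists of integers, and both ends are assumed to start from an all-zero reference. The names `encode_sequence` and `decode_sequence` are hypothetical.

```python
def quantize(values, step):
    return [int(round(v / step)) for v in values]


def dequantize(levels, step):
    return [q * step for q in levels]


def encode_sequence(frames, step):
    """Transmitter: code each frame against its own locally decoded last frame."""
    last = [0] * len(frames[0])  # assumed initial reference known to both ends
    stream = []
    for cur in frames:
        predicted = last  # stand-in for the motion-compensated prediction
        residual = [c - p for c, p in zip(cur, predicted)]
        levels = quantize(residual, step)
        stream.append(levels)
        # Local decoding: exactly the operation the receiver will perform.
        last = [p + r for p, r in zip(predicted, dequantize(levels, step))]
    return stream, last


def decode_sequence(stream, step, width):
    """Receiver: add the recovered differences to its own prediction."""
    last = [0] * width
    frames = []
    for levels in stream:
        last = [p + r for p, r in zip(last, dequantize(levels, step))]
        frames.append(last)
    return frames
```

Because both ends add the same dequantized differences to the same prediction, their last frames remain identical, and in this sketch the error in each displayed pixel never exceeds half a quantizer step no matter how many frames have been coded.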
In the interframe hybrid coding system using motion compensation described above, the objective is to compress the number of bits representing the image as much as possible without introducing significant visible distortion in the coded image.