A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
1. Field of the Invention
This invention relates to temporal post-processing of compressed moving images and processes for improving quality of compressed moving images having low frame rates.
2. Description of Related Art
Phone lines are commonly used for digital communications between computers or other devices using modems and standard communications protocols. Such communication protocols have bit rates limited by the quality of transmission over the phone lines. For example, the V.FAST and H.26P standards for PSTN lines have bit rates between 16.8 and 28.8 kbit/s depending on the quality of the connection. These bit rates are low when compared to the bandwidth needed for transmitting high quality digital moving images, especially if the bandwidth also carries audio and/or other information.
Conventional moving images are a series of frames (or still images) which are displayed sequentially. The frames can be represented digitally by two-dimensional arrays of pixel values which indicate colors and/or intensities of pixels in the frames. Transmission of uncompressed pixel values by videophones is impractical because of the large amount of data required to transmit every pixel value in every frame of a moving picture. Accordingly, videophone systems contain encoding circuits which compress a series of two-dimensional arrays of pixel values into codes representing the moving image and decoding circuits which convert codes back into a series of two-dimensional arrays.
Frame difference coding such as DPCM (differential pulse code modulation) is a well known compression technique that removes redundant information from a representation of a moving image. Frame difference coding subtracts pixel values of a preceding frame from pixel values of a current frame and extracts non-zero values which indicate changes between the frames. Redundant data, that is, data repeated in successive frames, appear as zeros in the difference frame, and the large number of zeros can be efficiently coded or removed. Motion estimation techniques further reduce the number of non-zero values by subtracting, from each block in a current frame, a block in a preceding frame at a position indicated by a motion vector. The motion vector is selected to reduce or minimize the difference.
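As an illustration only, the frame difference and motion estimation steps described above can be sketched as follows. This is a minimal sketch, not the coding used by any particular standard: the function names, the sum-of-absolute-differences cost, the exhaustive search, and the tiny 4x4 frames are all assumptions made for the example.

```python
def frame_difference(prev, curr):
    """Subtract the preceding frame from the current frame, pixel by pixel."""
    return [[c - p for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def block_sad(prev, curr, top, left, dy, dx, size):
    """Sum of absolute differences between the block of `curr` at (top, left)
    and the block of `prev` displaced by the candidate vector (dy, dx)."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(curr[top + r][left + c]
                         - prev[top + dy + r][left + dx + c])
    return total


def best_motion_vector(prev, curr, top, left, size, search):
    """Exhaustively search +/- `search` pixels for the vector with least cost.
    (Assumed search strategy; the text only says the vector is selected to
    reduce or minimize the difference.)"""
    height, width = len(prev), len(prev[0])
    best = (0, 0)
    best_cost = block_sad(prev, curr, top, left, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= top + dy and top + dy + size <= height
                      and 0 <= left + dx and left + dx + size <= width)
            if not inside:
                continue  # candidate block would fall outside the frame
            cost = block_sad(prev, curr, top, left, dy, dx, size)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost


# A 2x2 object moves one pixel down and one pixel right between frames.
prev = [[9, 9, 0, 0],
        [9, 9, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
curr = [[0, 0, 0, 0],
        [0, 9, 9, 0],
        [0, 9, 9, 0],
        [0, 0, 0, 0]]

diff = frame_difference(prev, curr)
nonzero = sum(1 for row in diff for value in row if value != 0)
vector, cost = best_motion_vector(prev, curr, top=1, left=1, size=2, search=1)
print(nonzero, vector, cost)  # 6 (-1, -1) 0
```

In this example the plain difference frame has six non-zero values, while the block at row 1, column 1 of the current frame matches the preceding frame exactly once the motion vector (-1, -1) is applied, leaving a zero difference block.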
Even with compression techniques, the bit rate of a communication channel limits the maximum frame rate (the number of frames per second) for a moving image. For example, the H.26P standard limits video transmission to less than ten frames per second. At low frame rates, motion in displayed moving images appears jittery or discontinuous rather than smooth. Accordingly, processes for improving moving image quality at low bit rates are widely sought.
In accordance with an embodiment of the invention, a video decoding and display system uses temporal post-processing to smooth motion displayed in moving images. The temporal post-processing uses motion vectors to generate one or more interpolated frames which are inserted in the moving image. The motion vectors indicate an offset between similar areas in first and second frames. The similar areas typically contain an object that moves in the moving image. Interpolated motion vectors, which are determined from the motion vectors, indicate interpolated positions of moving objects at times between the first and second frames. One or more interpolated frames inserted between the first and second frames show the objects at the interpolated positions indicated by the interpolated motion vectors.
One embodiment of the invention decodes and displays a moving image by: decoding first and second consecutive frames from a signal representing the moving image; generating interpolated motion vectors from motion vectors which identify blocks in the first frame which are similar to blocks in the second frame; generating one or more interpolated frames from the interpolated motion vectors; and displaying a series of frames including the first frame followed by the interpolated frames followed by the second frame. Display times of the first and second frames are delayed so that the display times of the interpolated frames are proportionally spaced between the first and second frames.
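By way of example, the interpolated motion vectors and the proportionally spaced display times can be sketched as below. The sketch assumes that a motion vector points from a block in the second frame to the similar block in the first frame, that positions are interpolated linearly, and that the names interpolated_vectors, display_times, and the delay parameter are hypothetical; none of these details is fixed by the text above.

```python
from fractions import Fraction


def interpolated_vectors(motion_vector, num_interpolated):
    """Return one scaled vector per interpolated frame, in display order.

    Interpolated frame k (k = 1..N) sits at time k/(N+1) between the first
    and second frames, so the object is still (1 - k/(N+1)) of the motion
    vector away from its position in the second frame (assumed convention).
    """
    dy, dx = motion_vector
    vectors = []
    for k in range(1, num_interpolated + 1):
        remaining = 1 - Fraction(k, num_interpolated + 1)
        vectors.append((round(dy * remaining), round(dx * remaining)))
    return vectors


def display_times(t_first, t_second, num_interpolated, delay=0):
    """Display times for the first frame, the interpolated frames, and the
    second frame, equally spaced and shifted by an assumed decoding delay."""
    step = (t_second - t_first) / (num_interpolated + 1)
    return [t_first + delay + k * step for k in range(num_interpolated + 2)]


# Two interpolated frames between frames nominally 300 ms apart.
print(interpolated_vectors((6, -3), 2))  # [(4, -2), (2, -1)]
print(display_times(0, 300, 2))          # [0.0, 100.0, 200.0, 300.0]
```

The delay parameter reflects that an interpolated frame cannot be generated until the second frame has been decoded, so the whole series is shifted later in time rather than the spacing being changed.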
In addition to the motion vectors, information used in generating the interpolated frame includes pixel values from the first and/or second frames and/or values from difference blocks used to generate the second frame from the first frame. Typically, the motion vectors and difference blocks are decoded from a signal which represents the second frame; but alternatively, a decoder determines motion vectors from the first and second frames during post-processing.
Another embodiment of the invention is a process for forming a moving image which includes: determining a motion vector that indicates a relative offset between first and second areas of the moving image, wherein the first area in a first frame of the moving image is visually similar to the second area in a second frame of the moving image; and generating a block of interpolated pixel values from pixel values representing the first area in the first frame and pixel values representing the second area in the second frame, wherein the block of interpolated pixel values represents a third area in an interpolated frame.
Still another embodiment of the invention is a method for generating a two-dimensional array representing an interpolated frame. The method includes: filling a buffer alternatively with a dummy value or with weighted average values from first and second frames; and generating a block of pixel values for each motion vector, wherein each generated block of pixel values is determined from the pixel values which represent a base area of a corresponding motion vector for the second frame and pixel values which represent an area of the first frame that is offset from the base area by an amount indicated by the motion vector. Each generated block of pixel values is written to the buffer at storage locations which correspond to an area which is offset from the base area by a fraction of the motion vector. The pixel values written replace some or all of the average or dummy values in the buffer. Any remaining dummy values are replaced with pixel values that are weighted averages of corresponding pixel values from the first and second frames.
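The buffer-filling method just described can be illustrated with the following minimal sketch, which uses the dummy-value variant. It assumes a single interpolated frame, square blocks, frames held as lists of rows, and a dictionary mapping each base area's top-left corner to its motion vector; the names DUMMY, weighted, and interpolate_frame are hypothetical and chosen only for this example.

```python
DUMMY = None  # assumed marker for buffer locations not yet written


def weighted(first_value, second_value, fraction):
    """Weighted average; `fraction` is the weight given to the first frame."""
    return round(fraction * first_value + (1 - fraction) * second_value)


def interpolate_frame(first, second, vectors, block, fraction=0.5):
    """Build an interpolated frame from two frames and per-block vectors.

    `vectors` maps the (top, left) corner of a base area in the second frame
    to the (dy, dx) offset of the similar area in the first frame.
    """
    height, width = len(first), len(first[0])
    buf = [[DUMMY] * width for _ in range(height)]  # fill with dummy values

    for (top, left), (dy, dx) in vectors.items():
        # Destination area is offset from the base area by a fraction
        # of the motion vector.
        oy, ox = round(dy * fraction), round(dx * fraction)
        for r in range(block):
            for c in range(block):
                y2, x2 = top + r, left + c   # pixel in the second frame
                y1, x1 = y2 + dy, x2 + dx    # matching pixel in first frame
                yi, xi = y2 + oy, x2 + ox    # location in interpolated frame
                in_first = 0 <= y1 < height and 0 <= x1 < width
                in_buf = 0 <= yi < height and 0 <= xi < width
                if in_first and in_buf:
                    buf[yi][xi] = weighted(first[y1][x1], second[y2][x2],
                                           fraction)

    # Replace any remaining dummy values with weighted averages of the
    # co-located pixels of the first and second frames.
    for y in range(height):
        for x in range(width):
            if buf[y][x] is DUMMY:
                buf[y][x] = weighted(first[y][x], second[y][x], fraction)
    return buf


# A 2x2 object moves four pixels to the right between the two frames.
first = [[8, 8, 0, 0, 0, 0],
         [8, 8, 0, 0, 0, 0]]
second = [[0, 0, 0, 0, 8, 8],
          [0, 0, 0, 0, 8, 8]]
vectors = {(0, 4): (0, -4)}  # only the moving block is given a vector here

middle = interpolate_frame(first, second, vectors, block=2)
print(middle)  # [[4, 4, 8, 8, 4, 4], [4, 4, 8, 8, 4, 4]]
```

In the result, the object appears at full intensity midway between its two positions, while the locations no block was written to hold plain averages of the two frames. Supplying vectors for every block, or a different fill rule, would change those remaining values.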