This invention relates to encoding and decoding of complex video image information including motion components that may be found in multi-media applications such as video-conferencing, video-phone, and video games. In order to be able to transfer complex video information from one machine to another, it is often desirable or even necessary to employ video compression techniques. One significant approach to achieving a high compression ratio is to remove the temporal and spatial redundancy which is present in a video sequence. To remove spatial redundancy, an image can be divided into disjoint blocks of equal size. These blocks are then subjected to a transformation (e.g., the Discrete Cosine Transform or DCT), which de-correlates the data so that it is represented as discrete frequency components.
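The block-based transformation described above can be illustrated with a minimal sketch. It assumes an orthonormal two-dimensional DCT-II, a picture stored as a list of rows, and picture dimensions that are exact multiples of the block size. The function names `split_into_blocks` and `dct_2d` and the 4x4 block size are hypothetical choices made for illustration and do not come from the text.

```python
import math

def split_into_blocks(image, size):
    """Split an image (list of rows) into disjoint size x size blocks."""
    height, width = len(image), len(image[0])
    # Assumption: the image dimensions are exact multiples of the block size.
    assert height % size == 0 and width % size == 0
    blocks = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            blocks.append([row[left:left + size] for row in image[top:top + size]])
    return blocks

def dct_2d(block):
    """Return the orthonormal 2-D DCT-II of a square block."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # basis[k][x] = cos((2x + 1) * k * pi / (2n))
    basis = [[math.cos((2 * x + 1) * k * math.pi / (2 * n)) for x in range(n)]
             for k in range(n)]
    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical frequency index
        for v in range(n):      # horizontal frequency index
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += block[y][x] * basis[u][y] * basis[v][x]
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs

# A flat 8x8 picture: every 4x4 block has all of its energy in the DC term.
flat_image = [[100.0] * 8 for _ in range(8)]
flat_blocks = split_into_blocks(flat_image, 4)
flat_coeffs = dct_2d(flat_blocks[0])
```

For a spatially uniform block, only the zero-frequency coefficient is non-zero, which is the sense in which the transformation concentrates spatially redundant data into few frequency components.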
Motion film and video provide a sequence of still pictures that creates the visual illusion of moving images. Provided that the pictures are acquired and displayed in an appropriate manner, the illusion can be very convincing. In modern television systems it is often necessary to process picture sequences from film or television cameras. Processing that changes the picture rate reveals the illusory nature of television. A typical example is the conversion between European and American television standards, which have picture rates of 50 and 60 Hz respectively. Conversion between these standards requires the interpolation of new pictures intermediate in time between the input pictures. Many texts on signal processing describe the interpolation of intermediate samples for a properly sampled signal using linear filtering. Unfortunately, linear filtering techniques applied to television standards conversion may fail to work. Because video signals are under-sampled, fast moving images can result in blurring or multiple images when television standards are converted using linear filtering.
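The multiple-image artifact of linear filtering can be shown with a small one-dimensional sketch. It assumes the simplest possible linear filter, a two-picture weighted blend, and a synthetic picture containing one bright object. The helper names `output_phases`, `make_picture` and `linear_interpolate` are hypothetical and serve only this illustration.

```python
from fractions import Fraction

def output_phases(in_rate, out_rate, count):
    """For each output picture, return (input picture index, fractional phase)."""
    phases = []
    for k in range(count):
        position = Fraction(k * in_rate, out_rate)   # time in input picture periods
        index = position.numerator // position.denominator
        phases.append((index, position - index))
    return phases

def make_picture(width, position, size=2):
    """Return a 1-D picture with a bright object of `size` pixels at `position`."""
    return [1.0 if position <= x < position + size else 0.0 for x in range(width)]

def linear_interpolate(prev_picture, next_picture, phase):
    """Blend two pictures; `phase` is the output time as a fraction in [0, 1]."""
    return [(1.0 - phase) * a + phase * b
            for a, b in zip(prev_picture, next_picture)]

# 50 Hz to 60 Hz: most output pictures fall between two input pictures.
phases = output_phases(50, 60, 7)

# A fast object moves 8 pixels between two consecutive input pictures.
prev_picture = make_picture(16, 2)
next_picture = make_picture(16, 10)
blended = linear_interpolate(prev_picture, next_picture, 0.5)
```

In `blended`, the object appears at half amplitude at both its old and its new position, and nothing appears at the midpoint where the object would actually be. This is the double image that results from blending under-sampled pictures.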
The benefits of motion compensation as a way of overcoming the problems of processing moving images are widely recognized in the prior art. Motion compensation attempts to process moving images in the same way as the human visual system. The human visual system is able to move the eyes to track moving objects, thereby keeping their image stationary on the retina. Motion compensation in video image processing attempts to work in the same way. Corresponding points on moving objects are treated as stationary, thus avoiding the problems of under-sampling. In order to do this, an assumption is made that the image consists of linearly moving rigid objects (sometimes slightly less restrictive assumptions can be made). In order to apply motion-compensated processing, it is necessary to track the motion of the moving objects in an image. Many techniques are available to estimate the motion present in image sequences.
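The idea of treating corresponding points on a moving object as stationary can be sketched in one dimension. The sketch assumes a single motion vector that is already known and applies to the whole picture, integer pixel displacements, and edge clamping. These are simplifications for illustration, and the function names are hypothetical.

```python
def fetch(picture, x):
    """Read a pixel, clamping the coordinate at the picture edges."""
    return picture[min(max(x, 0), len(picture) - 1)]

def motion_compensated_interpolate(prev_picture, next_picture, phase, velocity):
    """Interpolate along one known motion vector (pixels per picture period)."""
    back = round(phase * velocity)   # displacement back to the previous picture
    forward = velocity - back        # displacement forward to the next picture
    output = []
    for x in range(len(prev_picture)):
        a = fetch(prev_picture, x - back)      # same object point, earlier in time
        b = fetch(next_picture, x + forward)   # same object point, later in time
        output.append((1.0 - phase) * a + phase * b)
    return output

# A 2-pixel object moves linearly by 8 pixels between the two input pictures.
before = [1.0 if 2 <= x < 4 else 0.0 for x in range(16)]
after = [1.0 if 10 <= x < 12 else 0.0 for x in range(16)]
midway = motion_compensated_interpolate(before, after, 0.5, 8)
```

Because both input pictures are sampled along the trajectory of the object, the interpolated picture contains one full-amplitude object at the midpoint position. The result depends entirely on the motion vector being correct and on the linear-motion assumption holding.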
Motion compensation has been demonstrated to provide improvement in the quality of processed pictures. The artifacts of standards conversion using linear filtering, i.e., blurring and multiple-imaging, can be completely eliminated. Motion compensation, however, can only work when its underlying assumptions are valid for the subject image. If a subject image does not consist of linearly moving rigid objects, the motion estimation and compensation system is unable to reliably track motion, resulting in random motion vectors. When a motion estimation system fails, the processed pictures can contain subjectively objectionable switching artifacts. Such artifacts can be significantly worse than the linear standards conversion artifacts which motion compensation is intended to avoid.
Motion vectors are used in a broad range of video signal applications, such as coding, noise reduction, and scan-rate or frame-rate conversion. Some of these applications, particularly frame-rate conversion, require estimation of the “true-motion” of the objects within a video sequence. Several algorithms have been previously proposed to achieve true-motion estimation. Algorithms have also been proposed that seek to provide motion estimation at a low complexity level. Pel-recursive algorithms, which generally provide sub-pixel accuracy, and a number of block-matching algorithms have been reported to yield highly accurate motion vectors.
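As a point of reference for the block-matching family mentioned above, the following is a minimal sketch of a generic exhaustive (full-search) block matcher using the sum of absolute differences as the matching criterion. It is not any of the specific algorithms alluded to in the text. The matching criterion, the search range, the sign convention of the vector (pointing from the current block into the previous picture), and all function names are assumptions made for illustration.

```python
def sad(prev_picture, cur_picture, top, left, dy, dx, size):
    """Sum of absolute differences between a current block and a displaced previous block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur_picture[top + y][left + x]
                         - prev_picture[top + dy + y][left + dx + x])
    return total

def block_match(prev_picture, cur_picture, top, left, size, search_range):
    """Full search; returns ((dy, dx), cost) with the vector pointing into the previous picture."""
    height, width = len(prev_picture), len(prev_picture[0])
    best_vector, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose block falls outside the previous picture.
            if not (0 <= top + dy and top + dy + size <= height
                    and 0 <= left + dx and left + dx + size <= width):
                continue
            cost = sad(prev_picture, cur_picture, top, left, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost

def picture_with_patch(top, left):
    """Build a 12x12 picture containing a textured 4x4 patch at (top, left)."""
    picture = [[0] * 12 for _ in range(12)]
    for y in range(4):
        for x in range(4):
            picture[top + y][left + x] = 10 + 4 * y + x
    return picture

# The patch moves 1 pixel down and 2 pixels right between the two pictures.
previous = picture_with_patch(2, 2)
current = picture_with_patch(3, 4)
vector, cost = block_match(previous, current, top=3, left=4, size=4, search_range=3)
```

The search evaluates every candidate displacement within the range, which is simple but costly. The lower-complexity algorithms referred to above reduce the number of candidates that are evaluated.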