Many approaches to encoding a sequence of digital video images are known in the art. One classical approach is to divide each frame in the sequence into square blocks of predetermined size, also known as macroblocks. Each macroblock is then assigned a motion vector relative to a previously decoded frame, where the motion vector represents the offset between the current macroblock and the best-matching block of pixels of the same size in the previously reconstructed frame. The motion vector is transmitted to a decoder, which can then reconstruct the current frame based upon the previously decoded frame, the motion vector and a prediction error. Block-based techniques, however, can lead to distortions such as blocking and mosquito effects in low bit-rate applications.
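By way of illustration only, the block-matching step described above may be sketched as follows. The sketch assumes frames stored as lists of rows of integer intensities, an exhaustive search over a small window, and the sum of absolute differences as the matching criterion. None of these choices is prescribed by the foregoing description, and the function names are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced from it by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, n, search):
    """Exhaustive search over displacements in [-search, search] whose
    candidate block lies entirely inside `ref` (assumed search strategy).
    Returns (dx, dy, cost) for the lowest-cost candidate."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue  # candidate block would fall outside the frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]


def reconstruct_block(ref, bx, by, dx, dy, n, residual):
    """Decoder side: motion-compensated prediction plus prediction error."""
    return [
        [ref[by + dy + y][bx + dx + x] + residual[y][x] for x in range(n)]
        for y in range(n)
    ]
```

In this sketch the encoder would transmit the pair (dx, dy) together with the prediction error for each macroblock, and the decoder would call `reconstruct_block` against its own copy of the previously decoded frame.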
A more complex object-oriented, or region-oriented, approach encodes arbitrarily-shaped regions instead of rectangular or square blocks. While block-oriented coding techniques typically transmit two parameter sets, specifically the motion and color of each block, an object-oriented approach requires that the shape of each region be transmitted as well in order to allow reconstruction of the image. For example, M. Hotter, "Object-Oriented Analysis-Synthesis Coding Based On Moving Two-Dimensional Objects," Signal Processing: Image Communication, Vol. 2, pp. 409-428 (1990), presents an encoder which encodes arbitrarily-shaped regions, where objects are described by three parameter sets defining their motion, shape and color. A priority control determines in which of two modes the coded information will be sent, based upon the success or failure of the motion estimation technique for a particular region. The shape coding technique considered in the aforementioned article approximates the shape of each region by a combination of polygon and spline representations. U.S. Pat. No. 5,295,201 also discloses an object-oriented encoder which includes an apparatus for approximating the shape of an arbitrarily-shaped region by a polygon. The vertices of the polygon are determined, and the coordinate values of the vertices are calculated and transmitted.
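As a rough illustration of how a region boundary might be reduced to a small set of polygon vertices for transmission, the following sketch applies the well-known Ramer-Douglas-Peucker simplification to a traced contour. This technique is substituted here purely for illustration; it is not asserted to be the method of the Hotter article or of U.S. Pat. No. 5,295,201. The contour representation (a list of (x, y) points whose first and last points coincide), the tolerance parameter and the function names are all assumptions.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment, e.g. a closed contour whose ends coincide.
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Ramer-Douglas-Peucker: keep the point farthest from the chord joining
    the end points if it deviates by more than `tolerance`, then recurse on
    the two halves; otherwise keep only the end points."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    index, worst = 0, -1.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], a, b)
        if d > worst:
            index, worst = i, d
    if worst <= tolerance:
        return [a, b]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split point
```

Under these assumptions, only the coordinates of the returned vertices would need to be coded, and a larger tolerance would trade shape accuracy for fewer vertices.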
One color coding technique for use in object-oriented approaches is presented in Gilge et al., "Coding of Arbitrarily Shaped Image Segments Based On A Generalized Orthogonal Transform," Signal Processing: Image Communication, Vol. 1, pp. 153-180 (1989). According to the technique disclosed in this article, an intensity function inside each region is approximated by a weighted sum of basis functions which are orthogonal with respect to the shape of the region to be coded. While this technique may be theoretically useful, it is not practicable for implementation in a real-time system.
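The general idea of approximating the intensity inside a region by a weighted sum of basis functions that are orthogonal over that region can be sketched as follows. The sketch assumes a region given as a list of (x, y) pixel coordinates, low-order monomials as the starting functions, and Gram-Schmidt orthogonalization restricted to the pixels of the region. It is a simplified illustration under these assumptions and is not asserted to reproduce the procedure of the Gilge et al. article; all function names are hypothetical.

```python
def inner(f, g):
    """Inner product of two functions sampled at the region's pixels."""
    return sum(a * b for a, b in zip(f, g))


def orthonormal_basis(pixels, monomials):
    """Gram-Schmidt on the monomials x**px * y**py, evaluated only at the
    pixels of the region, so the result is orthonormal over that shape."""
    basis = []
    for px, py in monomials:
        v = [float(x ** px * y ** py) for x, y in pixels]
        for q in basis:
            c = inner(v, q)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = inner(v, v) ** 0.5
        if norm > 1e-9:  # skip functions that are dependent on this region
            basis.append([vi / norm for vi in v])
    return basis


def approximate(values, basis):
    """Weights are plain inner products because the basis is orthonormal;
    the approximation is the weighted sum of the basis functions."""
    weights = [inner(values, q) for q in basis]
    approx = [
        sum(w * q[i] for w, q in zip(weights, basis))
        for i in range(len(values))
    ]
    return weights, approx
```

In such a scheme only the weights would be transmitted for each region. Because the basis must be recomputed for every distinct region shape, the orthogonalization cost grows with the number and size of the regions.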
Due to the potential advantages of an object-oriented approach, there exists a need for an object-oriented encoder which provides powerful schemes for segmenting an image into arbitrarily-shaped regions, each of which has a corresponding motion vector, and for representing the content of each segment in a manner which can be readily implemented for real-time use. It is also desirable to have an encoder which can encode a generic scene or sequence of images, the content of which is not known beforehand, in contrast to the requirements of the prior art. It is further desirable to provide an encoder which permits additional functionalities, such as tracking objects moving from one area of a scene to another between images in a sequence.