Growing interest in digital video communication over public switched telephone networks and/or mobile channels has focused data compression efforts on object-based coding of the information to be transmitted from a sending unit to a receiving unit over such data links. Object-based coding (also referred to as model-based coding or analysis-synthesis coding) is evolving as a preferred approach to data compression, since the quality of digital video provided by so-called hybrid waveform encoders (such as the ITU-T H.261 encoder) is considered unacceptable at the bit rates commonly associated with switched telephone or mobile communication channels, which have a capacity in the range from about 14 to 25 kilobits per second (kbits/sec).
In object-based coding or data compression, an object is considered to be a significant feature of the scene content in a temporal sequence of image frames of video signals, such as, for example, the head portion of each participant in a video phone conversation. Object-based coding can be classified as three-dimensional (3-D) object-based coding or as two-dimensional (2-D) object-based coding. Various approaches under these classifications are reviewed in a publication by H. Li, A. Lundmark, and R. Forchheimer, titled "Image sequence coding at very low bit rates: A Review," IEEE Trans. Image Proc., vol. 3, pp. 589-609, September 1994, and in a publication by K. Aizawa and T. S. Huang, titled "Model-based image coding: Advanced video coding techniques for very low bit-rate applications," Proc. IEEE, vol. 83, no. 2, pp. 259-271, February 1995.
Existing 3-D object-based coding or data compression is not suitable for general purpose communication of video signals because it requires particular wire-frame model overlays on an image frame, based on prior knowledge of the scene content, and because such 3-D object-based coding cannot, in general, be implemented in real-time. While 2-D object-based coding or data compression does not require such prior scene content knowledge, present 2-D object-based approaches generally call for motion estimation strategies requiring either a global search of an entire image frame or a block-by-block search of the image frame. Such relatively time-consuming searches are incompatible with real-time data compression and data transmission of a temporal sequence of image frames. Aspects of such search-based motion estimation strategies can be found in a publication by Y. Nakaya and H. Harashima, titled "Motion compensation based on spatial transformations," IEEE Trans. Cir. and Syst.: Video Tech., vol. 4, pp. 339-356, June 1994, and in a publication by Y. Wang and O. Lee, titled "Active mesh--a feature seeking and tracking image sequence representation scheme," IEEE Trans. Image Proc., vol. 3, pp. 610-624, September 1994.
The following illustrative example indicates the degree of data compression required to transmit in real-time a sequence of frames at a television standard frame rate of 30 frames per second. Let a digitized image frame of video signals be represented by a 2-D array of 360 × 280 picture elements or pixels, each pixel having 8 bits for the signal level of each one of three color components of a digitized color television signal. In uncompressed form, such a sequence of frames would have to be transmitted at a bit rate of 360 × 280 × 8 × 3 × 30/second = 72.58 Mbits/sec, so that frame-to-frame incremental motion of some portion of the scene content would be perceived as flicker-free motion by an observer viewing successive image frames on a television display. If such a sequence had to be transmitted over a data link comprising a public-switched telephone network having a capacity of about 20 kbits/sec, a compression ratio of 72.58 Mbits/sec ÷ 20 kbits/sec ≈ 3.63 × 10³ would be necessary in this simplified example. Such exceedingly high data compression ratios are not currently attainable, and are in any event not compatible with current data compression standards such as, for example, the H.261 standard governing the performance of hybrid waveform encoders, or with proposed standards referred to as M-JPEG (Motion JPEG, after the Joint Photographic Experts Group) and MPEG (Moving Picture Experts Group).
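The arithmetic of this example can be checked with a short sketch. The function and variable names below are illustrative assumptions and are not part of the original text; the frame size, bit depth, frame rate, and channel capacity are the values stated in the example.

```python
def raw_bit_rate(width, height, bits_per_component, components, frames_per_second):
    """Return the uncompressed video bit rate in bits per second."""
    return width * height * bits_per_component * components * frames_per_second


def required_compression_ratio(source_bps, channel_bps):
    """Return how many times the source must be compressed to fit the channel."""
    return source_bps / channel_bps


# Values taken from the illustrative example: 360 x 280 pixels,
# 8 bits for each of 3 color components, 30 frames per second.
rate = raw_bit_rate(360, 280, 8, 3, 30)

# Assumed channel capacity of about 20 kbits/sec, as stated in the example.
ratio = required_compression_ratio(rate, 20_000)

print(f"{rate / 1e6:.2f} Mbits/sec")   # 72.58 Mbits/sec
print(f"{ratio:.0f}:1")                # 3629:1
```

The unrounded ratio is 3628.8, that is, on the order of 3.6 × 10³, which is the scale of compression the simplified example calls for.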
In order to arrive at more readily attainable data compression ratios for successive image frames of video signals, investigators have concentrated on compressing only those significant features or objects of the scene content which exhibit object motion on a frame-to-frame basis, as described in the aforementioned publications. Although such search-based data compression can provide moderately high data compression ratios, it has not been possible to date to achieve those ratios at a frame rate compatible with a television display, that is, at a real-time frame rate.