The invention relates generally to communicating sequences of image data over a narrow bandwidth communication channel, such as a telephone communications channel.
Image data, such as a video signal, consists of a sequence of images or frames. Since each frame contains a large amount of information, it is impossible to transmit the entire frame over a narrow bandwidth channel before the arrival of the next frame in the sequence. Accordingly, various techniques have been employed to compress the image data so as to reduce the number of bits of information to be transmitted. More specifically, these techniques take advantage of redundancies between successive frames, describing each frame in terms of changes from the previous frame.
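By way of illustration only, the idea of describing a frame in terms of its changes from the previous frame can be sketched as follows. The function names `encode_changes` and `apply_changes`, and the representation of a frame as a list of rows of integer pixel values, are assumptions made for this sketch and are not taken from any particular system.

```python
def encode_changes(prev, cur):
    """Return (row, col, new_value) for every pixel of cur that differs from prev."""
    return [
        (y, x, c)
        for y, (prev_row, cur_row) in enumerate(zip(prev, cur))
        for x, (p, c) in enumerate(zip(prev_row, cur_row))
        if p != c
    ]


def apply_changes(prev, changes):
    """Rebuild the current frame from the previous frame and the change list."""
    frame = [row[:] for row in prev]  # copy so the previous frame is preserved
    for y, x, value in changes:
        frame[y][x] = value
    return frame
```

When successive frames are largely identical, the change list is far shorter than the frame itself, which is the redundancy such techniques exploit.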
In a typical video signal having moving objects, the principal change occurring between successive frames is the inhomogeneous motion of objects within the field of view. Accordingly, the number of bits required to represent the sequence of frames can be reduced by describing each new frame in terms of the displacement of various components of the previous frame. This "motion compensation" technique requires substantially fewer bits to describe a sequence of images than other known data compression techniques. As a result, various motion compensating coding methods and apparatus have been developed employing this technique. These systems typically are either receiver-based motion compensation systems or transmitter-based motion compensation systems. In the receiver-based motion compensation system, the receiver makes a prediction as to the motion and compensates the previous frame for the expected motion. The transmitter, operating in the same manner, then sends only an error signal describing what must be done at the receiver in order to correct the receiver-predicted frame. The error signal is typically coded to reduce its bandwidth.
For a transmitter-based motion compensation system, the motion estimation process occurs only at the transmitter. For example, the transmitter calculates displacement vectors representing the motion of various regions of the previous frame. This data is then transmitted to the receiver along with error information. The receiver first adjusts its image of the previous frame using the displacement vectors provided by the transmitter. The receiver then corrects the resultant image using the error information provided by the transmitter.
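The two receiver steps described above (displacing the previous frame, then applying the error information) might be sketched as follows. This is a minimal sketch under stated assumptions: frames are lists of rows of integers, `vectors` maps a block index (row, column) to an offset (dy, dx) into the previous frame, and source coordinates are clamped at the frame border. None of these conventions, nor the names `motion_compensate` and `reconstruct`, is taken from the systems discussed.

```python
def motion_compensate(prev, vectors, block=8):
    """Predict a frame by copying each block from prev, offset by its vector.

    vectors[(block_row, block_col)] = (dy, dx): assumed convention in which the
    pixel at (y, x) is predicted from prev at (y + dy, x + dx).
    """
    h, w = len(prev), len(prev[0])
    pred = [[0] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            dy, dx = vectors[(by // block, bx // block)]
            for y in range(by, min(by + block, h)):
                for x in range(bx, min(bx + block, w)):
                    # Clamp at the frame border (an assumption of this sketch).
                    sy = min(max(y + dy, 0), h - 1)
                    sx = min(max(x + dx, 0), w - 1)
                    pred[y][x] = prev[sy][sx]
    return pred


def reconstruct(prev, vectors, error, block=8):
    """Receiver side: displace the previous frame, then add the error image."""
    pred = motion_compensate(prev, vectors, block)
    return [
        [p + e for p, e in zip(pred_row, err_row)]
        for pred_row, err_row in zip(pred, error)
    ]
```

In this sketch the error image is simply the pixel-wise difference between the actual frame and the displaced frame; a practical system would typically code that error before transmission.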
Typically, each displacement vector is associated with a specific region or block of the image. The blocks are typically non-overlapping and have, for example, a size of eight picture elements (pixels) by eight picture elements. Various methods have been employed for encoding the motion compensation data associated with each of the blocks. Ericsson, in his U.S. Pat. No. 4,849,810, the entire contents of which are incorporated herein by reference, describes a lossy coding method for encoding the motion-compensation displacement information.
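One common way of obtaining a displacement vector for each non-overlapping block is an exhaustive block-matching search, sketched below. The text above does not specify how the vectors are computed; the sum-of-absolute-differences cost, the small search range, the border clamping, and the names `sad` and `estimate_vectors` are all assumptions made for illustration.

```python
def sad(prev, cur, by, bx, dy, dx, block):
    """Sum of absolute differences between a block of cur and the block of
    prev offset by (dy, dx), with source coordinates clamped at the border."""
    h, w = len(prev), len(prev[0])
    total = 0
    for y in range(by, min(by + block, h)):
        for x in range(bx, min(bx + block, w)):
            sy = min(max(y + dy, 0), h - 1)
            sx = min(max(x + dx, 0), w - 1)
            total += abs(cur[y][x] - prev[sy][sx])
    return total


def estimate_vectors(prev, cur, block=8, search=2):
    """Return {(block_row, block_col): (dy, dx)} minimising the SAD cost.

    Every offset within +/- search pixels is tried; the zero vector is kept
    unless another offset gives a strictly lower cost.
    """
    h, w = len(prev), len(prev[0])
    vectors = {}
    for by in range(0, h, block):
        for bx in range(0, w, block):
            best = (0, 0)
            best_cost = sad(prev, cur, by, bx, 0, 0, block)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    cost = sad(prev, cur, by, bx, dy, dx, block)
                    if cost < best_cost:
                        best, best_cost = (dy, dx), cost
            vectors[(by // block, bx // block)] = best
    return vectors
```

With the default `block=8`, each vector covers one eight-by-eight pixel block, matching the example block size given above.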
The above-described motion compensation systems work well in still and moving areas of a video frame. However, in typical video conferencing or video telephone sessions, a person is moving in front of a static background. The newly uncovered areas of the background cannot be predicted by displacement of the previous frame. The encoding of these unpredictable areas therefore requires a large number of bits.
Background prediction has been proposed as a solution to this problem. More specifically, the receiver maintains an image of the background of the field of view. When a person moves, thereby exposing a new portion of the background, the transmitter describes the exposed pattern with reference to the stored background image, thereby drastically reducing the number of bits required to encode the newly exposed areas.
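A minimal sketch of how a coder might decide, block by block, whether to predict from the stored background image or from the motion-compensated (warped) previous frame is given below. The decision rule (choose whichever predictor has the lower sum of absolute differences against the current frame) and the names `block_sad` and `select_predictors` are assumptions for illustration, not a description of any particular proposed system.

```python
def block_sad(a, b, by, bx, block):
    """Sum of absolute differences between co-located blocks of a and b."""
    h, w = len(a), len(a[0])
    return sum(
        abs(a[y][x] - b[y][x])
        for y in range(by, min(by + block, h))
        for x in range(bx, min(bx + block, w))
    )


def select_predictors(cur, warped, background, block=8):
    """Return {(block_row, block_col): "background" or "warped"}.

    A block is predicted from the stored background only when that gives a
    strictly smaller error than the warped previous frame.
    """
    h, w = len(cur), len(cur[0])
    choice = {}
    for by in range(0, h, block):
        for bx in range(0, w, block):
            err_warped = block_sad(cur, warped, by, bx, block)
            err_background = block_sad(cur, background, by, bx, block)
            key = (by // block, bx // block)
            choice[key] = "background" if err_background < err_warped else "warped"
    return choice
```

In a newly uncovered area the warped frame still shows the person, so the stored background matches the current frame far better and only the choice of predictor, rather than the full pixel content, need be signalled for that block.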
Often, during heavy motion or scene changes, so much information must be transmitted that the bandwidth available during a single frame time is insufficient to convey everything necessary to describe the frame. Accordingly, various methods have been implemented to encode and transmit the portion of the frame information which will minimize the image degradation. In his U.S. Pat. No. 4,849,810, Ericsson describes a hierarchical encoding method for efficiently communicating such image sequences so as to provide a more graceful degradation of the image during heavy motion or scene changes.
It is therefore an object of the present invention to transmit sequences of images over a narrow bandwidth communication channel using hierarchical encoding for efficiently communicating background prediction information. Another object of the invention is to reduce visible switching artifacts, which arise when switching between the background and motion-compensated images, by independently selecting between the background and warped images at each resolution level of the hierarchically encoded images.