The present invention pertains generally to the field of video communications, and in particular, the invention relates to a system and method for using object prioritization and layered image coding in image/video transmission.
Video/image communication applications over very low bitrate channels such as the Internet or the Public Switched Telephone Network (PSTN) are growing in popularity and use. Conventional image communication technology, e.g., the JPEG or GIF format, requires a large bandwidth because of the size (i.e., amount of data) of the picture. Thus, in the low bitrate channel case, the resulting received image quality is generally not acceptable.
Methods have been used to improve video/image communication and/or to reduce the amount of information required to be transmitted for low bitrate channels. One such method has been used in videophone applications. An image is encoded by three sets of parameters which define its motion, shape and surface color. Since the subject of the visual communication is typically a human, primary focus can be directed to the subject's head or face.
One known method for object (face) segmentation is to create a dataset describing a parameterized face. This dataset defines a three-dimensional description of a face object. The parameterized face is given as an anatomically-based structure by modeling muscle and skin actuators and force-based deformations. In such parameterized face models, a set of polygons may be used to define a human face. Each vertex of the polygons is defined by X, Y and Z coordinates and is identified by an index number. A particular polygon is defined by the set of vertex indices surrounding the polygon. A code may also be added to the set of indices to define a color for the particular polygon.
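By way of illustration only, the polygon-based dataset described above might be represented as in the following minimal sketch. The names FaceModel, Polygon, vertex_indices and color_code are hypothetical and are chosen only for this example; the coordinate values are arbitrary.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# A vertex is an (X, Y, Z) coordinate triple; its index number is its
# position in the model's vertex list.
Vertex = Tuple[float, float, float]


@dataclass
class Polygon:
    # Indices of the vertices surrounding this polygon.
    vertex_indices: List[int]
    # Optional code defining a color for this polygon (hypothetical encoding).
    color_code: Optional[int] = None


@dataclass
class FaceModel:
    vertices: List[Vertex]
    polygons: List[Polygon]

    def polygon_coordinates(self, polygon_index: int) -> List[Vertex]:
        """Resolve a polygon's vertex indices into X, Y, Z coordinates."""
        polygon = self.polygons[polygon_index]
        return [self.vertices[i] for i in polygon.vertex_indices]


# Two triangles sharing an edge; the values are arbitrary illustration data.
model = FaceModel(
    vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.5), (1.0, 1.0, 0.5)],
    polygons=[Polygon([0, 1, 2], color_code=7), Polygon([1, 3, 2])],
)
```

Because the polygons refer to vertices by index, adapting the model to a particular face only requires changing the vertex coordinates; the polygon definitions remain unchanged.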
Systems and methods are also known that analyze digital images, recognize a human face and extract facial features. Conventional facial feature detection systems use methods such as facial color tone detection, template matching, edge detection approaches or disparity map methods.
In conventional face model-based video communications, a generic face model is typically either transmitted from the sender to the receiver at the beginning of a communication sequence or pre-stored at the receiver side. During the communication, the generic model is adapted to a particular speaker's face. Instead of sending entire images from the sender's side, only parameters that modify the generic face model need to be sent to achieve compression requirements.
Another coding scheme used in image transmission is layered source coding. In this coding scheme, video data information is decomposed into a number of layers, each representing a different perceptually relevant component of the video source. The base layer contains the essential information for the source and can be used to generate an output video signal with an acceptable quality. With the enhancement layers, a higher quality video signal can be obtained.
FIG. 2 illustrates a typical video system 10 with layered coding and transport prioritization. A layered source encoder 11 encodes input video data. A plurality of channels 12 carry the encoded data. A layered source decoder 13 decodes the encoded data.
There are different ways of implementing layered coding. For example, in temporal domain layered coding, the base layer contains a bit stream with a lower frame rate and the enhancement layers contain incremental information to obtain an output with higher frame rates. In spatial domain layered coding, the base layer codes the sub-sampled version of the original video sequence and the enhancement layers contain additional information for obtaining higher spatial resolution at the decoder.
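The two decompositions described above can be sketched as follows. This is an illustrative simplification and not a practical codec: frames are treated as opaque items, a single scanline of pixel values stands in for a full picture, and the function names and the sub-sampling factor are assumptions made for the example.

```python
def split_temporal(frames, factor=2):
    """Base layer: every factor-th frame. Enhancement layer: the rest."""
    base = frames[::factor]
    enhancement = [f for i, f in enumerate(frames) if i % factor != 0]
    return base, enhancement


def merge_temporal(base, enhancement, factor=2):
    """Interleave both layers to restore the full frame rate."""
    base_iter, enh_iter = iter(base), iter(enhancement)
    total = len(base) + len(enhancement)
    return [
        next(base_iter) if i % factor == 0 else next(enh_iter)
        for i in range(total)
    ]


def split_spatial(row, factor=2):
    """Base layer: sub-sampled scanline. Enhancement layer: the residual
    between the original and the base layer upsampled by sample repetition."""
    base = row[::factor]
    upsampled = [base[i // factor] for i in range(len(row))]
    enhancement = [o - u for o, u in zip(row, upsampled)]
    return base, enhancement


def merge_spatial(base, enhancement, factor=2):
    """Upsample the base layer and add the residual back."""
    return [base[i // factor] + e for i, e in enumerate(enhancement)]
```

In both cases a decoder that receives only the base layer can still produce an output (a lower frame rate, or a lower spatial resolution), while the enhancement layer cannot be used without the base layer.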
Generally, each layer uses a different data stream and has a distinctly different tolerance to channel errors. To combat channel errors, layered coding is usually combined with transport prioritization so that the base layer is delivered with a higher degree of error protection. If the base layer is lost, the data contained in the enhancement layers may be useless.
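As a simplified illustration of delivering the base layer with a higher degree of error protection, the following sketch applies a repetition code with majority voting to the base layer only. The repetition code is an assumption chosen for brevity; the text above does not name a particular protection scheme, and the function names are hypothetical.

```python
def protect(bits, repeat):
    """Send each bit `repeat` times (repeat=1 means no added protection)."""
    return [b for b in bits for _ in range(repeat)]


def recover(received, repeat):
    """Majority vote over each group of `repeat` received bits.
    `repeat` should be odd so that the vote cannot tie."""
    decoded = []
    for i in range(0, len(received), repeat):
        group = received[i:i + repeat]
        decoded.append(1 if sum(group) * 2 > len(group) else 0)
    return decoded


base_layer = [1, 0, 1]
enhancement_layer = [0, 1, 1]

# Base layer is sent with three-fold repetition, enhancement layer unprotected.
sent_base = protect(base_layer, repeat=3)
sent_enh = protect(enhancement_layer, repeat=1)

# Simulate one channel error in each stream by flipping a single bit.
sent_base[4] ^= 1
sent_enh[0] ^= 1

decoded_base = recover(sent_base, repeat=3)  # error corrected by the vote
decoded_enh = recover(sent_enh, repeat=1)    # error passes through
```

The same single-bit error is corrected in the protected base layer but corrupts the unprotected enhancement layer, at the cost of tripling the base layer's bit rate.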
The inventor has realized that there are benefits in using aspects of model-based coding and layered source coding techniques to improve performance, in particular, using prioritization in object coding for image/video transmission.
It is an object of the present invention to address the limitations of the conventional video/image communication systems and model-based coding discussed above.
One aspect of the present invention is directed to prioritizing objects identified in an image.
Another aspect of the present invention is directed to masking certain objects based upon the assigned priority and encoding the unmasked objects separately from the masked objects.
One embodiment of the invention relates to a method for coding data in an image/video communication system including the steps of identifying at least two objects within an image, assigning models to represent the objects and prioritizing the objects in accordance with predetermined prioritization rules. Communication channels are assigned to communicate data related to the models for the two objects so that a higher priority object is assigned to a communication channel having a reliability factor higher than that of the channel assigned to a lower priority object.
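The channel assignment step of this embodiment might be sketched as follows, under stated assumptions: priorities are integers with larger values meaning higher priority, reliability factors are numbers with larger values meaning more reliable, and the names CodedObject, Channel and assign_channels are hypothetical. The prioritization rules themselves are outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CodedObject:
    name: str
    priority: int  # assumption: larger value = higher priority


@dataclass
class Channel:
    name: str
    reliability: float  # assumption: larger value = more reliable


def assign_channels(objects: List[CodedObject],
                    channels: List[Channel]) -> Dict[str, str]:
    """Pair objects with channels so that a higher priority object is
    carried on a channel at least as reliable as that of any lower
    priority object."""
    if len(channels) < len(objects):
        raise ValueError("need at least one channel per object")
    ranked_objects = sorted(objects, key=lambda o: o.priority, reverse=True)
    ranked_channels = sorted(channels, key=lambda c: c.reliability, reverse=True)
    return {o.name: c.name for o, c in zip(ranked_objects, ranked_channels)}


assignment = assign_channels(
    [CodedObject("background", priority=1), CodedObject("face", priority=2)],
    [Channel("A", reliability=0.90), Channel("B", reliability=0.99)],
)
```

In this example the face object is mapped to channel B and the background to channel A. If two channels have equal reliability factors, the sketch still produces an assignment, although neither channel is then strictly more reliable than the other.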
These and other embodiments and aspects of the present invention are exemplified in the following detailed disclosure.