1. Field of the Invention
The present invention relates to a model-based coding/decoding method and system for coding or decoding a face image, and more particularly to a model-based coding/decoding method and system capable of accurately coding or decoding a face image, including its expression and motion, with a small amount of data.
2. Description of the Related Art
Conventionally, high-efficiency coding of an image has basically been performed by fully automatic processing. This is because a vast number and great variety of images to be coded must be processed in a sophisticated, complicated, real-time manner in all applications, including facsimile, TV conferencing, image monitoring and video disks. Fully automatic processing, however, confines technical development to waveform coding, and the improvement in coding efficiency has reached its limit.
A model-based coding scheme is intended to improve the coding efficiency further. A representative model-based coding scheme is the analysis-synthesis type of coding scheme. In this scheme, the face image information constituting an input image is analyzed according to a predetermined image model called a wire-frame model. Specifically, the contour of the face and feature points, including the end points of the eyes and the end points of the mouth, are extracted, used as analysis parameters and tracked over time. The analysis parameters thus extracted are coded with high efficiency together with texture information such as the initial face image, and are output to a medium such as a transmission path or a storage medium. At the decoding end, the original image information is decoded from the analysis parameters and the texture information transmitted or stored through the medium, using the same wire-frame model as the one prepared at the coding end.
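The flow described above can be illustrated with a minimal sketch. It is not the method of any particular conventional system: the names `CodedSequence`, `encode` and `decode`, the use of named feature points, and the choice to code each frame as a displacement from the initial frame are all assumptions made for illustration. The sketch only recovers feature-point positions; deforming the wire-frame model and mapping the texture onto it at the decoding end are omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[int, int]      # (x, y) pixel position of one feature point
Frame = Dict[str, Point]     # feature-point name -> position (hypothetical layout)


@dataclass
class CodedSequence:
    texture: bytes                    # initial face image, sent only once
    initial: Frame                    # feature points of the initial frame
    deltas: List[Dict[str, Point]]    # per-frame displacement from the initial frame


def encode(texture: bytes, tracked: List[Frame]) -> CodedSequence:
    """Coding end: keep the texture once, then only feature-point displacements."""
    initial = tracked[0]
    deltas = [
        {name: (x - initial[name][0], y - initial[name][1])
         for name, (x, y) in frame.items()}
        for frame in tracked[1:]
    ]
    return CodedSequence(texture, dict(initial), deltas)


def decode(coded: CodedSequence) -> List[Frame]:
    """Decoding end: rebuild every frame's feature points from the shared start."""
    frames = [dict(coded.initial)]
    for delta in coded.deltas:
        frames.append({
            name: (coded.initial[name][0] + dx, coded.initial[name][1] + dy)
            for name, (dx, dy) in delta.items()
        })
    return frames
```

Because both ends share the same model and the same initial frame, only the small per-frame displacement tables need to be sent after the first frame.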
The advantage of this analysis-synthesis type of coding scheme is that it can express an image using a small number of analysis parameters and a small amount of texture information on the initial image, so that coding is possible at a very low bit rate.
The conventional analysis-synthesis type of coding scheme, however, poses the problems described below.
In the model-based coding of a face image by the analysis-synthesis type of coding scheme, it may be required that the facial expression and motion of a face image of another person, for example, an actor or an actress appearing on TV or in a movie (hereinafter referred to as a reference image), be fitted to the face image to be coded, such as a face image of the user picked up by a camera. In such a case, the conventional method assumes that both the initial image to be coded and the initial reference image are expressionless. Therefore, in the case where a facial expression appears in the initial image of the image to be coded or of the reference image, the correspondence between the analysis parameters and the parameters of the wire-frame model is disrupted, so that proper coding becomes impossible. As a result, a correct facial expression and motion cannot be presented in the decoded face image.
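The effect of the expressionless-initial-image assumption can be illustrated with a simplified sketch. It assumes, purely for illustration, that the reference expression is transferred by adding the displacement of each reference feature point from its initial position to the corresponding point of the image to be coded; the function name `transfer_expression` and the coordinates used below are hypothetical and are not taken from any conventional system.

```python
from typing import Dict, Tuple

Point = Tuple[int, int]      # (x, y) pixel position, y increasing downward
Frame = Dict[str, Point]     # feature-point name -> position (hypothetical layout)


def transfer_expression(target_initial: Frame,
                        reference_initial: Frame,
                        reference_current: Frame) -> Frame:
    """Move each target point by the reference point's displacement.

    The displacement is measured from the reference's initial frame, so the
    result is only meaningful if that initial frame is expressionless.
    """
    moved = {}
    for name, (tx, ty) in target_initial.items():
        rx0, ry0 = reference_initial[name]
        rx, ry = reference_current[name]
        moved[name] = (tx + (rx - rx0), ty + (ry - ry0))
    return moved
```

With an expressionless reference start such as `{"mouth_corner": (50, 100)}` and a smiling current frame `{"mouth_corner": (52, 95)}`, a target point at `(60, 200)` moves to `(62, 195)`. If the reference sequence instead starts on the smiling frame, the same smiling frame yields zero displacement and the target stays at `(60, 200)`, so the smile is lost in the decoded image.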
Also, since only a single initial image and a single wire-frame model are used, the model-based coding of an image in which a rigid object such as a pair of glasses overlaps a soft object such as a face poses the problem that the glasses are undesirably deformed as the facial expression changes.
In addition, a vast amount of data is required for sending out a plurality of face images (several hundred face images, for example) representing motion such as a change in facial expression. Transferring this data from the transmitting end to the decoding end takes considerable time.