1. Field of the Invention
The invention relates to a method of using an MPEG standard in object segmentation and, more particularly, to a method of using MPEG-7 in object segmentation that uses both a human vision model and the jigsaw principle to recognize and recover an object.
2. Description of the Related Art
Currently, video object segmentation methods are divided into automatic and semiautomatic ones. Automatic schemes segment an object based on the object's motion information, while semiautomatic schemes segment an object based on a user-defined segmentation. The automatic mode cannot process a still object, and the semiautomatic mode is inconvenient. The MPEG-4 technique of video object segmentation is described as follows.
In MPEG-4, before coding, motion estimation, object segmentation, and shape coding are applied to each object in a video or an image. Video object segmentation is regarded as one of the key advantages of the MPEG-4 technique.
In MPEG-4, the video content data can be segmented into a plurality of Video Object Planes (VOPs) based on requirements and the video content. The VOPs can be separately coded (compressed), stored and transferred. The VOPs can also be recomposed, deleted and replaced by the MPEG-4 standard system according to the needs of the application. Because a VOP is the basic unit of interaction, the success of applying MPEG-4 to various types of interactive multimedia depends on whether the VOP can be effectively separated from the video signal.
In the MPEG-4 video model, a frame can be defined by multiple video objects, and a high degree of user interaction is enabled, which allows more applications to be developed. The user can access the objects freely as needed to form the desired frame. In MPEG-4 video, an image is regarded as a combination of video objects represented by the VOPs. However, MPEG-4 defines no standard format for a video object other than a presentation model used to represent the video object. FIG. 1 illustrates the concept of the VOP. FIG. 2 shows the structure of a VOP encoder and a VOP decoder.
With reference to FIGS. 1 and 2, in step S1, on the encoder side, scene segmentation and depth layering are applied to the image as part of object segmentation, so that each segmented object receives an object definition. In step S2, layered encoding is performed on the segmented objects. At this point, the contour, motion, texture or coding information of each object is layer-encoded into bitstreams that pass through the multiplexer MUX, as shown in step S3, wherein the bitstreams include a background bitstream VOP1 and a broadcaster bitstream VOP2. In step S4, after the bitstreams are transferred to the decoder, the bitstreams VOP1 and VOP2 are separated by the demultiplexer DEMUX and decoded separately. In step S5, the decoded bitstreams VOP1 and VOP2 are composed to recover the original image.
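The flow of steps S1 through S5 can be illustrated with a minimal Python sketch. It is not an MPEG-4 encoder: no compression or shape coding is performed, a plane is simply a mapping from pixel position to value, and the function names `segment`, `mux`, `demux` and `compose` as well as the sample frame and mask are hypothetical names chosen for illustration only.

```python
# Illustrative sketch only: splits a frame by an object mask, carries each
# object as its own tagged stream, and recomposes the frame. No real
# MPEG-4 coding is performed; all names and data here are assumptions.

def segment(frame, mask):
    """S1: split a frame into a background plane and a foreground plane.

    Each plane maps (row, col) -> pixel value and stands in for the shape
    and texture information of one video object.
    """
    background, foreground = {}, {}
    for r, row in enumerate(frame):
        for c, pixel in enumerate(row):
            target = foreground if mask[r][c] else background
            target[(r, c)] = pixel
    return background, foreground

def mux(planes):
    """S2-S3: tag each plane with a stream id, as a multiplexer would."""
    return [(stream_id, plane) for stream_id, plane in enumerate(planes, start=1)]

def demux(stream):
    """S4: recover the planes, ordered by stream id."""
    return [plane for _, plane in sorted(stream, key=lambda item: item[0])]

def compose(planes, height, width):
    """S5: paint the planes back to front to rebuild the frame."""
    frame = [[None] * width for _ in range(height)]
    for plane in planes:
        for (r, c), pixel in plane.items():
            frame[r][c] = pixel
    return frame

frame = [[10, 10, 10],
         [10, 99, 99],
         [10, 99, 10]]
mask = [[0, 0, 0],
        [0, 1, 1],
        [0, 1, 0]]

vop1, vop2 = segment(frame, mask)           # background and foreground planes
restored = compose(demux(mux([vop1, vop2])), height=3, width=3)
```

Because each plane travels as an independent stream, a receiver could in principle replace or drop one plane before composition, which is the kind of interaction the VOP model is meant to support.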
From a technical point of view, MPEG-4 segments an image into different video objects so that the still and moving patterns in the image can be processed separately. A higher compression rate is used for the still background, while a comparatively lower compression rate is used for the moving foreground. In addition, when the data is transferred, the background information is transferred only once; thereafter, only the moving object information is transferred. The two kinds of information are composed at the client terminal. As such, the amount of data compressed and transferred can be greatly reduced.
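The saving from transferring the background only once can be illustrated with simple arithmetic. The following Python sketch uses made-up byte counts and hypothetical function names; it models only the transfer pattern described above and says nothing about the compression rates an actual codec achieves.

```python
# Illustrative arithmetic only; the byte counts below are assumptions,
# not measurements from any MPEG-4 implementation.

def object_based_bytes(num_frames, background_bytes, foreground_bytes):
    """Background is sent once; foreground is sent with every frame."""
    return background_bytes + num_frames * foreground_bytes

def full_frame_bytes(num_frames, background_bytes, foreground_bytes):
    """Baseline: background and foreground are both sent with every frame."""
    return num_frames * (background_bytes + foreground_bytes)

frames = 100
background = 80_000   # assumed coded size of the still background
foreground = 20_000   # assumed coded size of the moving object per frame

object_based = object_based_bytes(frames, background, foreground)
baseline = full_frame_bytes(frames, background, foreground)
ratio = object_based / baseline
```

With these assumed sizes the object-based transfer amounts to 2,080,000 bytes against 10,000,000 bytes for the baseline, and the gap widens as the number of frames grows.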
However, the segmentation methods above solve neither the problem of segmenting a still object nor the inconvenience of user-defined segmentation.
An object of the invention is to provide a method of using MPEG-7 in object segmentation that uses both a human vision model and the jigsaw principle to recognize and recover an object.
The invention provides a method of using MPEG-7 in object segmentation that extracts the features of an object to build up a database, allowing the object to be recognized quickly from an image regardless of whether the object is moving. The method includes the following steps. First, an object feature database is built up according to the MPEG-7 standardized definition. Second, when a video image frame is present, a watershed process is used to divide the video image frame into a plurality of objects. Third, each of the divided objects is compared with the object descriptor in the object feature database to find a target object. Based on the comparison result, the most similar object, including its shape and position in the video image frame, is extracted as the target object.
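The comparison step can be illustrated with a minimal Python sketch under strong simplifying assumptions. The watershed process is not implemented here; a precomputed label map stands in for its output. The descriptor (region area and bounding-box aspect ratio) is a hypothetical stand-in and not an actual MPEG-7 descriptor, and the names `describe`, `find_target` and `feature_database` are illustrative only.

```python
import math

# Illustrative sketch only. The label map stands in for the output of a
# watershed step, and the (area, aspect ratio) feature is an assumed toy
# descriptor, not one defined by MPEG-7.

def describe(label_map, label):
    """Return a toy feature vector and the bounding box of one region."""
    cells = [(r, c)
             for r, row in enumerate(label_map)
             for c, value in enumerate(row)
             if value == label]
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    feature = (len(cells), width / height)
    bbox = (min(rows), min(cols), max(rows), max(cols))
    return feature, bbox

def find_target(label_map, reference_feature):
    """Return the label and bounding box of the region closest to the reference."""
    best = None
    for label in sorted({value for row in label_map for value in row}):
        feature, bbox = describe(label_map, label)
        distance = math.dist(feature, reference_feature)
        if best is None or distance < best[0]:
            best = (distance, label, bbox)
    return best[1], best[2]

# Assumed database: object name -> reference feature (area, aspect ratio).
feature_database = {"square": (4, 1.0), "bar": (3, 0.3)}

# Assumed segmentation result: 0 is background, 1 and 2 are objects.
label_map = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 2],
]

square_label, square_bbox = find_target(label_map, feature_database["square"])
bar_label, bar_bbox = find_target(label_map, feature_database["bar"])
```

The match does not depend on motion, so a still object is found as readily as a moving one, and the returned bounding box gives the position of the matched object in the frame. An unnormalized Euclidean distance is used only for brevity; a practical system would need descriptors and a similarity measure suited to the chosen features.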