1. Field of the Invention
The present invention relates to handling a particular part in multimedia data, and more particularly, to a method and apparatus for tracking a particular object in video sequence frames.
2. Description of the Related Art
As multimedia environments become more diversified, demand is gradually increasing for continuously tracking or extracting an object area of interest to a user in ordinary video sequences such as movies, TV programs, and commercial films (CFs).
Technologies for tracking an object area in video sequences can be broadly divided into the following four methods.
First, the most widely used technology is based on block matching. This technology is relatively easy to implement and, if the shape of an object rarely changes, shows satisfactory matching (or tracking) performance between frames. For this reason, it is the basic method used for estimating block motion vectors between frames in MPEG-1 and MPEG-2, which are standard technologies for moving picture compression. However, if complex object transformation (scale change, rotation, or non-rigid motion) occurs across consecutive frames, the probability of failure in block matching-based object tracking increases.
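The block matching described above can be illustrated with a minimal sketch. It assumes grayscale frames stored as lists of rows, an exhaustive search window, and the sum of absolute differences (SAD) as the matching cost. The names `sad`, `match_block`, `size`, and `radius` are hypothetical and chosen only for illustration; they are not taken from any cited reference or standard.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks.

    The block in frame_a has its top-left corner at (ax, ay);
    the block in frame_b has its top-left corner at (bx, by).
    """
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def match_block(prev, curr, x, y, size, radius):
    """Exhaustively search curr for the block located at (x, y) in prev.

    Returns (dx, dy, cost) for the displacement with the lowest SAD,
    or None if no candidate block fits inside the frame.
    """
    height, width = len(curr), len(curr[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cx, cy = x + dx, y + dy
            # Skip candidate blocks that would fall outside the frame.
            if cx < 0 or cy < 0 or cx + size > width or cy + size > height:
                continue
            cost = sad(prev, curr, x, y, cx, cy, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

Because the cost compares pixels at fixed offsets inside a rigid block, a scale change, rotation, or non-rigid deformation of the object raises the SAD even at the correct location, which is consistent with the failure mode noted above.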
Secondly, there is a method based on a geometric model, in which geometric characteristics of an object are modeled (e.g., as a wire-frame model) and the model is then located in input images [U.S. Pat. No. 6,269,172]. This method works well for rigid-body transformation, even under partial occlusion. However, it is complicated to implement, requires a large amount of computation, and is not suited to non-rigid object transformation. In addition, a separate model must be built for each object.
Thirdly, there is a method based on an active contour model (or snake) [Kass et al., “Snakes: Active Contour Models”, IJCV, Vol. 2, 1988, pp. 321–331] [U.S. Pat. No. 6,266,443] [U.S. Pat. No. 6,259,802]. Like the geometric model-based method above, this method requires an appropriate contour model to be set for a particular object, but it tracks non-rigid body transformation effectively to some degree. However, this method has the disadvantage that the contour is easily trapped by neighboring background image features.
Finally, there is a method based on color histogram information. This method shows relatively satisfactory tracking results for complex object transformation and partial occlusion and, above all, has the advantage of fast processing speed [Swain, et al., “Color Indexing”, IJCV, Vol. 7, 1991, pp. 11–32] [U.S. Pat. No. 5,845,009] [U.S. Pat. No. 6,226,388]. However, when a color similar to that of the tracked object is distributed in background areas adjacent to the object, the probability of tracking failure increases because this method separates the object poorly from the similarly colored background. In addition, since most technologies of this kind use templates having particular shapes (e.g., a rectangle or an ellipse) to represent object areas, they cannot provide accurate shape information of the object during continuous tracking. They therefore cannot effectively provide an adaptive compensation mechanism for either complex object shape deformation due to non-rigid object motion or the temporal change of the object color distribution due to illumination variation [G. R. Bradski, “Computer vision face tracking as a component of a perceptual user interface”, IEEE Work. On Applic. Comp. Vis., Princeton, 214–219, 1998] [D. Comaniciu, V. Ramesh, and P. Meer, “Real-time tracking of Non-rigid objects using mean shift”, IEEE Conference on Computer Vision and Pattern Recognition, Hilton Head, S.C., June 2000, vol. II, 142–149]. Also, the tracking performance of color histogram-based methods is sensitive to the value chosen for the color histogram bin resolution.
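As an illustration of color histogram matching and of its sensitivity to bin resolution, the following is a minimal sketch of a normalized histogram intersection in the spirit of the color indexing approach cited above. It assumes pixels given as 8-bit (R, G, B) tuples and uniform bins. The names `color_histogram`, `histogram_intersection`, and `bins_per_channel` are hypothetical and are not taken from any cited reference.

```python
def color_histogram(pixels, bins_per_channel):
    """Count 8-bit RGB pixels into bins_per_channel ** 3 uniform bins."""
    hist = [0] * bins_per_channel ** 3
    for r, g, b in pixels:
        # Map each channel value in 0..255 to a bin index.
        ri = r * bins_per_channel // 256
        gi = g * bins_per_channel // 256
        bi = b * bins_per_channel // 256
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1
    return hist


def histogram_intersection(candidate, model):
    """Return the sum of bin-wise minima divided by the model pixel count.

    The score is 1.0 when every model bin is fully covered by the
    candidate and 0.0 when the two histograms share no bins.
    """
    total = sum(model)
    if total == 0:
        return 0.0
    return sum(min(c, m) for c, m in zip(candidate, model)) / total
```

With these definitions, a pixel of color (100, 0, 0) and a pixel of color (120, 0, 0) fall into the same bin at 8 bins per channel, giving a score of 1.0, but into different bins at 16 bins per channel, giving a score of 0.0. A small color shift, such as one caused by illumination variation, can therefore change the match score sharply depending on the chosen bin resolution.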