Field of the Invention
The present invention generally relates to tracking an object in video images, and more particularly, to a method that controls tracking using a color model of the object being tracked.
Background Information
When a conventional video tracker loses the object it is tracking in a video image, the tracker either stops tracking or begins following something else. Some conventional algorithms are also designed to re-detect lost objects. Such re-detection is usually based on local features, such as Kanade-Lucas-Tomasi (KLT) features, the Scale-Invariant Feature Transform (SIFT), and Speeded-Up Robust Features (SURF), or on global features, such as color histograms. A conventional video tracker may also monitor the quality of tracking according to a confidence value computed by the tracking algorithm.
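As an illustration of the global-feature approach mentioned above, a color model of the tracked object can be represented as a quantized joint color histogram and compared against candidate image regions during re-detection. The following is a minimal sketch in Python with NumPy; the function names and the 8-bins-per-channel quantization are illustrative choices, not part of any specific conventional tracker.

```python
import numpy as np

def color_histogram(patch, bins=8):
    """L1-normalized joint RGB histogram of an HxWx3 uint8 patch.

    A coarse global color model like this can serve as the object model
    when attempting to re-detect a lost target.
    """
    # Quantize each channel into `bins` levels (e.g., 256/8 = 32 per level).
    idx = (patch.astype(np.uint32) // (256 // bins)).reshape(-1, 3)
    # Combine the three channel indices into a single joint-bin index.
    flat = idx[:, 0] * bins * bins + idx[:, 1] * bins + idx[:, 2]
    hist = np.bincount(flat, minlength=bins ** 3).astype(float)
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return float(np.minimum(h1, h2).sum())
```

A candidate region whose histogram intersection with the stored object model exceeds a threshold would be accepted as a re-detection; regions with low intersection would be rejected.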
One example of a conventional video tracker is the Kanade-Lucas-Tomasi (KLT) feature tracker, which was proposed mainly to address the high computational cost of other conventional image registration techniques. KLT makes use of spatial intensity information to direct the search for the position that yields the best match. It is faster than other conventional techniques because KLT examines far fewer potential matches between image frames.
The KLT feature tracker is based on the paper, Bruce D. Lucas and Takeo Kanade, “An Iterative Image Registration Technique with an Application to Stereo Vision,” International Joint Conference on Artificial Intelligence, pages 674-679, 1981, where Lucas and Kanade developed the idea of a local search using gradients weighted by an approximation to the second derivative of the image, in an iterative process. The tracking is computed on features (i.e., points together with their neighborhoods) that are suitable for the tracking algorithm. See Carlo Tomasi and Takeo Kanade, “Detection and Tracking of Point Features,” Carnegie Mellon University Technical Report CMU-CS-91-132, April 1991. A post-processing of the points can be done using the technique disclosed in the paper, Jianbo Shi and Carlo Tomasi, “Good Features to Track,” IEEE Conference on Computer Vision and Pattern Recognition, pages 593-600, 1994, where an affine transformation is fit between the image of the currently tracked feature and its image from a non-consecutive previous frame. If the affine-compensated image is too dissimilar, the feature is dropped.
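The gradient-based local search described above can be sketched as a single Gauss-Newton update of the Lucas-Kanade method, restricted for clarity to pure translation: the spatial gradients of the image patch form a 2x2 normal-equation system whose solution is the displacement estimate, which an iterative tracker would then refine. This is a simplified NumPy sketch of the underlying mathematics, not the full pyramidal KLT implementation.

```python
import numpy as np

def lk_translation_step(T, I):
    """One Gauss-Newton step of Lucas-Kanade translation estimation.

    Given a template patch T and a same-sized image patch I, estimate the
    displacement d = (dx, dy) that minimizes sum((I(x + d) - T(x))^2),
    linearized to first order: I(x + d) ~ I(x) + Ix*dx + Iy*dy.
    """
    # Spatial image gradients (np.gradient returns d/d(rows), d/d(cols)).
    Iy, Ix = np.gradient(I.astype(float))
    err = (T - I).astype(float)  # residual that the motion must explain
    # Normal equations A d = b; A is the 2x2 structure tensor of the patch.
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = np.array([np.sum(Ix * err), np.sum(Iy * err)])
    return np.linalg.solve(A, b)  # estimated (dx, dy)
```

An iterative tracker repeats this step, warping the patch by the accumulated displacement each time, until the update becomes negligible. The Shi-Tomasi criterion cited above selects patches for which the matrix A is well-conditioned, i.e., its smaller eigenvalue is large.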
There is a need to improve conventional video trackers in the areas of detecting when the tracked object is lost, re-detecting the lost object, responding to tracking loss, and other areas.