A variety of visual tracking algorithms have been devised and applied in many fields. Visual tracking is a challenging task because the appearance of a tracking target can vary significantly, and high-level scene understanding is often required to handle exceptions.
Tracking-by-detection algorithms are one common approach to this challenging task; they typically rely on a bounding box to represent the target object. However, tracking by detection often suffers from a drifting problem when the target object undergoes substantial non-rigid or articulated motion.
Recently, tracking-by-segmentation algorithms that rely only on pixel-level information have been actively proposed. However, these algorithms are not sufficient to model the semantic structure of the target object, and some of them even depend on external segmentation algorithms, e.g., Grabcut.
As a result, visual tracking techniques employing mid-level cues have been proposed to handle non-rigid and deformable target objects. For example, one such technique uses superpixels for discriminative appearance modeling via mean-shift clustering and incorporates particle filtering to find an optimal state for the target object. Another such technique adopts a superpixel-based constellation model to deal with non-rigid deformations of the target object.
However, both of the visual tracking techniques mentioned above may fail to capture semantic relations between superpixels, since each of them classifies every superpixel as foreground or background independently. To overcome the limitations of these two techniques, a technique based on a hierarchical representation of target object appearance using multiple quantization levels, such as pixel, superpixel, and bounding box, has been proposed.
In addition, another tracking technique has been proposed that performs dynamic multi-level appearance modeling by maintaining an adaptive clustered decision tree that utilizes information obtained from the three different levels. However, both of these techniques require an external segmentation algorithm such as Grabcut.
As such, all of the existing approaches, or algorithms, have drawbacks such as those described above.
Thus, a novel tracking-by-segmentation algorithm with a framework using an Absorbing Markov Chain (AMC) is proposed in the specification of the present invention.
In particular, the devised algorithm using the AMC is well suited for tracking target objects with non-rigid and articulated motions. A segmentation of the target objects, as well as an initial segmentation mask, is obtained naturally within the devised framework.
The devised algorithm accurately distinguishes foreground objects from background objects based on the result of projection operations that discriminate features of the target object more efficiently than metric learning.
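For readers unfamiliar with absorbing Markov chains, the following is a minimal, generic sketch of one standard quantity associated with such a chain: the expected number of steps before a walk starting in each transient state is absorbed. It is not the devised framework itself; this passage does not state how the chain is constructed or used, so the function name `expected_absorption_steps`, the matrix `Q` (transition probabilities among transient states only), and the toy random walk are all assumptions made for illustration.

```python
def expected_absorption_steps(Q):
    """Expected steps to absorption from each transient state.

    Q is the square matrix of transition probabilities between
    transient states (a hypothetical input for this sketch).
    Solves (I - Q) t = 1 by Gauss-Jordan elimination with partial pivoting.
    """
    n = len(Q)
    # Build the augmented matrix [I - Q | 1].
    A = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)] + [1.0]
         for i in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("I - Q is singular: some state is never absorbed")
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [row[n] for row in A]


# Toy chain (illustrative only): a symmetric random walk on positions 0..4,
# where 0 and 4 are absorbing and 1, 2, 3 are transient.
Q = [
    [0.0, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.5, 0.0],
]
steps = expected_absorption_steps(Q)
print(steps)
```

In this toy chain the middle state is farthest from both absorbing states, so it has the largest expected absorption time. Whether and how such a quantity enters the devised algorithm is not specified here.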