Object tracking is an essential component of many computer vision applications such as robotics, video surveillance and video analysis. Generally, object tracking uses correspondences, motion, or contours of objects in successive frames of a video. Contour tracking is preferred when non-rigid objects are tracked. Unlike rigid object tracking, non-rigid object tracking considers contour variations due to translational and non-translational motion of the objects. Accurate contour information is an important descriptor in many object recognition applications such as military target detection, abnormal event analysis in surveillance, and object metrology.
Correspondence based tracking establishes correspondence between features on objects, I. Haritaoglu, D. Harwood and L. Davis. “W4: Who? When? Where? What? A real time system for detecting and tracking people,” AFGR, 1998; B. Lucas and T. Kanade, “An iterative image registration technique with an application to stereo vision,” International joint conference on artificial intelligence, pages 674-679, 1981; and A. Yilmaz, X. Li and M. Shah, “Contour-based object tracking with occlusion handling in video acquired using mobile cameras,” IEEE transactions on pattern analysis and machine intelligence, 26(11), pp. 1531-1536, November 2004.
One method integrates temporal differencing and template correlation matching for object tracking, A. J. Lipton, H. Fujiyoshi and P. S. Patil, “Moving target classification and tracking from real time video,” DARPA, pages 129-136, 1998. Another method adapts mean-shift to track objects using histogram similarities in local kernels, D. Comaniciu, V. Ramesh and P. Meer, “Real-time tracking of non-rigid objects using mean shift,” CVPR, volume 2, pages 142-149, 2000. Other modalities can also be integrated within an object representation, F. Porikli and O. Tuzel, “Human body tracking by adaptive background models and mean-shift analysis,” Proceedings of IEEE Intl. conference on computer vision systems, workshop on PETS, 2003.
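The mean-shift idea mentioned above can be illustrated with a minimal sketch. It is not the cited method: it assumes a precomputed per-pixel weight map (for example, one obtained by back-projecting a target histogram) and simply moves a square window to the weighted centroid until the shift is small. The function name `mean_shift_window` and its parameters are hypothetical.

```python
import numpy as np

def mean_shift_window(weights, center, half_size, max_iter=20, eps=0.5):
    """Shift a square window toward the local weighted centroid.

    weights   : 2D array of non-negative per-pixel weights (assumed given).
    center    : (row, col) of the initial window center.
    half_size : half the window side length, in pixels.
    """
    h, w = weights.shape
    cy, cx = float(center[0]), float(center[1])
    for _ in range(max_iter):
        # Clip the window to the image bounds.
        y0 = max(int(round(cy)) - half_size, 0)
        y1 = min(int(round(cy)) + half_size + 1, h)
        x0 = max(int(round(cx)) - half_size, 0)
        x1 = min(int(round(cx)) + half_size + 1, w)
        patch = weights[y0:y1, x0:x1]
        total = patch.sum()
        if total <= 0:
            break  # no support inside the window; stop
        ys, xs = np.mgrid[y0:y1, x0:x1]
        new_cy = (ys * patch).sum() / total
        new_cx = (xs * patch).sum() / total
        shift = np.hypot(new_cy - cy, new_cx - cx)
        cy, cx = new_cy, new_cx
        if shift < eps:
            break  # converged
    return cy, cx

# Usage: a uniform 5x5 blob centered at (30, 35); the window starts nearby.
weights = np.zeros((50, 50))
weights[28:33, 33:38] = 1.0
cy, cx = mean_shift_window(weights, center=(25, 30), half_size=8)
```

A full tracker would recompute the weight map in every frame from histogram similarity between the target model and the candidate region; that step is omitted here.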
Motion based tracking estimates the movement of objects. Often, the objects are assumed to be planar shapes such as ellipses and rectangles, M. J. Black and D. J. Fleet, “Probabilistic detection and tracking of motion discontinuities,” Proc. of IEEE international conference on computer vision, pages 551-558, September 2000; T. Jebara and A. Pentland, “Parameterized structure from motion for 3D adaptive feedback tracking of faces,” Proc. of IEEE computer society conf. on computer vision pattern recognition, pages 144-150, June 1997; and J. Shao, S. K. Zhou and R. Chellappa, “Tracking algorithm using background-foreground motion models and multiple cues,” IEEE International conference on acoustics, speech and signal processing, March 2005.
Contour based tracking locates object contours in consecutive frames of a video. In a B-spline contour tracking process, a particle filter is used, M. Isard and A. Blake, “Contour tracking by stochastic propagation of conditional density,” Proceedings of ECCV, pages 343-356, 1996. The particle filter was initially used as a probability propagation model, N. J. Gordon, D. J. Salmond and A. Smith, “Novel approach to non-linear/non-Gaussian Bayesian state estimation,” IEEE proceedings on radar and signal processing, 140:107-113, 1993. When the particle filter is applied to rigid objects, good results can be obtained. However, that particle filter based method cannot extract an exact contour of a non-rigid object during tracking. Therefore, that method is less effective when applied to non-rigid object tracking and to video sequences with heavily cluttered backgrounds. When a fixed ellipse is used to delineate an object of interest, the result can hardly reflect any shape deformation, which is regarded as important information in many computer vision applications, such as military target detection, abnormal event analysis in surveillance, and object metrology.
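For orientation, the predict, weight and resample cycle of a basic (bootstrap) particle filter can be sketched as follows. This is a simplified one-dimensional illustration, not the contour tracker of the cited works: the random-walk motion model, the Gaussian observation likelihood, and every name and parameter value are assumptions chosen for brevity.

```python
import numpy as np

def bootstrap_particle_filter(observations, n_particles=500,
                              process_std=1.0, obs_std=1.0, seed=0):
    """Estimate a scalar state from noisy observations.

    Assumed model (illustrative only):
      state:       x_t = x_{t-1} + N(0, process_std^2)
      observation: z_t = x_t     + N(0, obs_std^2)
    """
    rng = np.random.default_rng(seed)
    # Initialize particles around the first observation.
    particles = rng.normal(observations[0], obs_std, size=n_particles)
    estimates = []
    for z in observations:
        # Predict: propagate each particle through the motion model.
        particles = particles + rng.normal(0.0, process_std, size=n_particles)
        # Weight: Gaussian likelihood of the observation given each particle.
        log_w = -0.5 * ((z - particles) / obs_std) ** 2
        w = np.exp(log_w - log_w.max())  # subtract max for numerical stability
        w /= w.sum()
        # Estimate: weighted mean of the particles.
        estimates.append(float(np.sum(w * particles)))
        # Resample: draw particles in proportion to their weights.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return estimates

# Usage: twenty identical observations of a stationary target at 5.0.
estimates = bootstrap_particle_filter([5.0] * 20)
```

In a contour tracker, each particle would instead hold a set of shape parameters (for example, B-spline control points or ellipse parameters), and the likelihood would be measured from image edges along the hypothesized contour.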
It is desirable for tracking methods to provide accurate contours. One method applies a particle filter to non-rigid object contour tracking, P. Li, T. Zhang and A. E. C. Pece, “Visual contour tracking based on particle filters,” Image Vision Computing, 21(1):111-123, 2003. However, that method still does not provide an appropriate model for discriminating actual object boundaries from all the edge points present in a video.
Snakes, also known as dynamic contours, are another common approach, which evolves the object contour to minimize an energy function composed of an external energy and an internal energy. However, snake based methods are restricted to a relatively small range of scenarios because snakes rely on intensities inside objects remaining substantially uniform. In addition, the computational complexity of snakes is a drawback for real-time applications.
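As a point of reference, the energy minimized by a snake is commonly written in the following form. The notation is an assumption, not taken from the text above: v(s) = (x(s), y(s)) is the contour parameterized by s in [0, 1], α and β weight the stretching and bending terms of the internal energy, and E_ext is an image-derived external energy such as the negative gradient magnitude.

```latex
E_{\text{snake}}
  = \int_{0}^{1}
    \underbrace{\tfrac{1}{2}\left(
        \alpha \,\lvert v'(s) \rvert^{2}
      + \beta  \,\lvert v''(s) \rvert^{2}
    \right)}_{\text{internal energy}}
  + \underbrace{E_{\text{ext}}\bigl(v(s)\bigr)}_{\text{external energy}}
    \, ds
```

The internal term keeps the contour smooth, while the external term pulls it toward image features; minimizing the sum iteratively for every frame is the source of the computational cost noted above.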
A level set method is another approach, which deals with topological changes of a moving front. The level set method uses partial differential equations (PDEs) to describe object motion, contour and region-based information. However, level set methods are also computationally complex.
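For illustration, the basic level set evolution equation is usually stated as below. The symbols are assumptions introduced here: φ(x, t) is the level set function, F is the speed of the front along its normal direction, and the contour Γ(t) is the zero level set of φ.

```latex
\frac{\partial \phi}{\partial t} + F \,\lvert \nabla \phi \rvert = 0,
\qquad
\Gamma(t) = \{\, \mathbf{x} : \phi(\mathbf{x}, t) = 0 \,\}
```

Because the contour is represented implicitly, it can split or merge without special handling, but φ must be updated over a grid of pixels rather than along a curve, which contributes to the computational complexity.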