Current imaging systems can convert live scenes into a sequence of digital images, which can be processed to track any object in the scene from frame to frame. The techniques used for tracking are numerous. Most currently available systems use some characteristic of the subset of the image containing the target to search for and locate the target in the following image. The quality and speed of a tracking system depend on how this search-and-locate idea is implemented.
Most tracking systems correlate a sample subimage representing the object with parts of the current image. The correlation values are computed in a search area around an estimated location of the object. The correlation operation is computationally expensive and is usually performed using specialized hardware.
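The correlation search described above can be illustrated with a minimal sketch. It is not taken from any of the systems discussed here; it assumes grayscale images stored as lists of rows, uses zero-mean normalized cross-correlation as the score, and the names `ncc`, `search`, and `radius` are hypothetical.

```python
import math

def ncc(image, template, top, left):
    """Zero-mean normalized cross-correlation of the template with one image patch."""
    h, w = len(template), len(template[0])
    patch = [image[top + r][left + c] for r in range(h) for c in range(w)]
    tmpl = [v for row in template for v in row]
    mean_p = sum(patch) / len(patch)
    mean_t = sum(tmpl) / len(tmpl)
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(patch, tmpl))
    den = math.sqrt(sum((p - mean_p) ** 2 for p in patch)
                    * sum((t - mean_t) ** 2 for t in tmpl))
    # A flat patch or flat template has no structure to match; score it as 0.
    return num / den if den else 0.0

def search(image, template, est_top, est_left, radius):
    """Score every template position within `radius` pixels of the estimate."""
    h, w = len(template), len(template[0])
    rows, cols = len(image), len(image[0])
    best_score, best_pos = -2.0, None  # NCC scores lie in [-1, 1]
    for top in range(max(0, est_top - radius), min(rows - h, est_top + radius) + 1):
        for left in range(max(0, est_left - radius), min(cols - w, est_left + radius) + 1):
            score = ncc(image, template, top, left)
            if score > best_score:
                best_score, best_pos = score, (top, left)
    return best_pos, best_score
```

For a search radius r and an h-by-w template, this sketch evaluates up to (2r+1)^2 positions and touches h*w pixels at each one, which is the source of the computational cost noted above.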
Another set of tracking methods uses a 3D model of the object being tracked. In these methods, the model is mapped into the target location based on the location and illumination parameters. The disadvantage of such model-based tracking methods is the relatively high amount of computation required to map the 3D model to the image. Tracking systems that avoid the correlation and model-matching approaches use characteristics of the object's appearance or motion to estimate the location of the object in the current image. These techniques are faster than correlation methods but are less robust to changing shape and to temporary occlusion by similarly colored objects in the scene.
The work by Darell et al. in U.S. Pat. No. 6,188,777 uses stereo cameras and involves three modules, which compute the range of the tracked object, segment the object based on color, and perform pattern classification. Each of these modules places a large computational load on the computer. The method of Peurach et al. in U.S. Pat. No. 6,173,066 uses a 3D object model database and projection geometry to find the pose of the object in the 2D camera image. The pose determination and tracking involve searching in a multi-dimensional object pose space. The computation involved is very high.
The method of Richards in U.S. Pat. No. 6,163,336 uses special cameras, infrared lighting, and a specialized background. The method of Marques et al. in U.S. Pat. No. 6,130,964 involves a layered segmentation of the object in the scene based on a homogeneity measure. The method also involves a high amount of computation. The template matching method proposed by Holliman et al. in U.S. Pat. No. 6,075,557, which tracks subimages in the larger camera image, involves search and correlation, which means relatively large amounts of computation. The method of Ponticos in U.S. Pat. No. 6,035,067 uses segmentation of the image based on pixel color. The system of Wakitani in U.S. Pat. No. 6,031,568 uses hardware to do template matching of the target. The method is computationally expensive, and the correlation is done in hardware.
The tracking method proposed by Suito et al. in U.S. Pat. No. 6,014,167 relies mostly on the difference image between successive frames to detect motion and then tracks the moving pixels using color. This work uses correlation and searches in a multi-dimensional space to compute the object's 3D position and orientation. The amount of computation involved is immense.
The method proposed by Matsumura et al. in U.S. Pat. No. 6,002,428 does color matching to track the target. The method of Guthrie in U.S. Pat. No. 5,973,732 uses differencing and blob analysis. The method of Hunke in U.S. Pat. No. 5,912,980 uses color matching as opposed to shape. The method of Tang et al. in U.S. Pat. No. 5,878,151 uses correlation to track subimages in the image.
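Several of the methods above detect motion from the difference image between successive frames. The following is a minimal sketch of that general idea only, not the procedure of any cited patent. It assumes grayscale frames stored as lists of rows, and the names `difference_mask`, `moving_centroid`, and `threshold` are hypothetical.

```python
def difference_mask(prev_frame, curr_frame, threshold):
    """Mark pixels whose intensity changed by more than `threshold` between frames."""
    return [[abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_frame, curr_frame)]

def moving_centroid(mask):
    """Return the mean (row, col) of the changed pixels, or None if nothing moved."""
    points = [(r, c) for r, row in enumerate(mask)
              for c, changed in enumerate(row) if changed]
    if not points:
        return None
    n = len(points)
    return (sum(r for r, _ in points) / n, sum(c for _, c in points) / n)
```

The mask marks both the pixels the object vacated and the pixels it newly covers, so the centroid falls between the old and new positions. This is one reason differencing is typically combined with a further cue such as color or blob analysis.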