Feature tracking, an image processing function, is the process of selecting features from an image and then tracking these features across multiple related images of the same visual scene. Each image is typically represented as an array of pixel values, and a feature in such an image is typically identified as a region of one or more pixels (or sub-pixels). The tracking defines the feature's path from image to image as a two-dimensional path in image coordinates. Tracking data can be further processed to generate an estimated path of the feature in three dimensions, based on the position of the feature and/or changes in camera position across frames in the original visual scene.
Feature tracking is the basis for several techniques whereby multiple feature points are simultaneously tracked across related image frames. These include techniques for tracking two-dimensional shapes across frames, for estimating three-dimensional paths of selected feature points, for estimating three-dimensional camera paths from multiple feature points, or for recovering estimated three-dimensional scene structure (including estimated depths of object surfaces) from feature tracking data. The use of feature tracking techniques in these applications can be very powerful, because they transform an image processing problem into a domain where the tools of geometry and knowledge of geometric constraints can be applied.
With standard feature tracking methods, the process generally follows these steps:
feature point selection: one or more feature points are selected in an initial image frame
frame-to-frame tracking: each feature point is individually tracked through successive frames
individual path estimation: the path for each feature point is estimated from its tracking data

increased efficiency, by calculating multiple feature points and paths in a unified method and allowing additional feature points and their paths to be interpolated rather than computed;
greater accuracy, by using a larger number of feature points and their paths;
higher performance, by processing image frames in a pair-wise parallel fashion using a pyramid technique amenable to pipelined or even real-time operation; and
improved robustness and reduced selection sensitivity, by using pyramid techniques and averaging over local neighborhoods to both guide and constrain feature tracking.
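The frame-to-frame tracking and path estimation steps listed above can be illustrated with a minimal sketch. It assumes grayscale frames stored as 2-D NumPy arrays and uses a plain sum-of-squared-differences (SSD) block match within a small search window; the function name `track_feature` and the parameters `patch` and `search` are hypothetical, and the text does not prescribe this particular matching criterion.

```python
import numpy as np

def track_feature(frames, start, patch=5, search=3):
    """Track one feature through a list of 2-D frames by SSD block matching.

    frames : list of 2-D arrays (grayscale images), all the same shape
    start  : (row, col) of the feature in frames[0], selected beforehand
    Returns the 2-D path as a list of (row, col), one entry per frame.
    """
    half = patch // 2
    r, c = start
    # The template is the patch around the feature in the initial frame.
    template = frames[0][r - half:r + half + 1, c - half:c + half + 1].astype(float)
    path = [(r, c)]
    for frame in frames[1:]:
        best_ssd, best_rc = None, (r, c)
        # Exhaustively test every offset in the search window.
        for dr in range(-search, search + 1):
            for dc in range(-search, search + 1):
                rr, cc = r + dr, c + dc
                # Skip candidates whose patch would fall off the image.
                if (rr - half < 0 or cc - half < 0 or
                        rr + half >= frame.shape[0] or cc + half >= frame.shape[1]):
                    continue
                cand = frame[rr - half:rr + half + 1, cc - half:cc + half + 1].astype(float)
                ssd = float(((cand - template) ** 2).sum())
                if best_ssd is None or ssd < best_ssd:
                    best_ssd, best_rc = ssd, (rr, cc)
        r, c = best_rc
        path.append((r, c))
    return path
```

Because each feature is matched in isolation, this sketch is exposed to the lost-feature and bad-match problems discussed in the paragraphs that follow: the lowest-SSD candidate is always accepted, whether or not it is the right one.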
In frame-to-frame tracking of individual feature points, some common problems are initial selection sensitivity, lost features, broken paths, and bad matches.
Most feature tracking methods are highly sensitive to the initial selection of each feature point. Automated selection is typically done on criteria applied solely to the initial frame (such as choosing an area of high contrast). This selection can easily prove to be a poor choice for tracking in successive frames. Likewise, a manual selection made by a human operator may not be well suited for tracking over multiple frames.
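As a rough illustration of an automated criterion applied solely to the initial frame, the sketch below picks the patch with the highest local intensity variance as a stand-in for "an area of high contrast". The function name `select_feature` and the `patch` parameter are hypothetical, and variance is only one assumed way to measure contrast.

```python
import numpy as np

def select_feature(frame, patch=5):
    """Return (row, col) of the patch centre with the highest local variance.

    Only the single frame passed in is examined; nothing about later
    frames influences the choice.
    """
    half = patch // 2
    rows, cols = frame.shape
    best_score, best_rc = -1.0, None
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            window = frame[r - half:r + half + 1, c - half:c + half + 1]
            score = float(window.var())  # variance as a simple contrast measure
            if score > best_score:
                best_score, best_rc = score, (r, c)
    return best_rc
```

The selection depends on nothing but the initial frame, which is exactly why such a choice can turn out to be poorly suited to tracking in successive frames.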
When features are tracked independently, selection sensitivity becomes critical. Even when multiple features can be correlated and tracked as a group, reducing selection sensitivity depends on tracking all the features across multiple image frames while maintaining the correlation between them.
A feature can be "lost" due to imaging artifacts such as noise or transient lighting conditions. These artifacts can make it difficult or impossible to distinguish the feature identified in one frame from its surroundings in another frame. A feature can also be lost when it is visible in one frame but occluded (or partially occluded) in another. Feature occlusion may be due to changing camera orientation, and/or movement of one or more object(s) in the visual scene.
A lost feature can re-appear in yet another frame without being recognized as a continuation of a previously identified feature. The re-appearing feature may be ignored and remain lost, or it may be incorrectly identified and tracked as an entirely new feature, creating a "broken path".
A broken path has two (or more) discontinuous segments such that one path ends where the feature was lost, and the next path begins where the feature re-appears. A single feature may therefore be erroneously tracked as multiple unrelated and independent features, each with its own unique piece of the broken path.
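To make the structure of a broken path concrete, the following sketch represents one feature's tracking data as a per-frame list of positions, with `None` marking frames in which the feature was lost, and splits it into its discontinuous segments. The representation and the name `split_segments` are assumptions made for illustration only.

```python
def split_segments(observations):
    """Split per-frame observations into contiguous path segments.

    observations : list where entry i is the (row, col) position in frame i,
                   or None if the feature was lost in that frame
    Returns a list of segments, each a list of (frame_index, position).
    """
    segments, current = [], []
    for frame_idx, pos in enumerate(observations):
        if pos is None:
            # The feature is lost here, so the current segment (if any) ends.
            if current:
                segments.append(current)
                current = []
        else:
            current.append((frame_idx, pos))
    if current:
        segments.append(current)
    return segments
```

For example, `split_segments([(10, 10), (11, 12), None, None, (14, 18), (15, 20)])` yields two segments. A tracker that does not recognize the re-appearing feature would treat those two segments as two unrelated features.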
All the conditions that lead to a lost feature can also contribute to a "bad match". A bad match is a feature identified in one frame that is incorrectly matched to a different feature in another frame. A bad match can be even more troublesome than a lost feature or broken path, since the feature tracking algorithm proceeds as if the feature were being correctly tracked.
Some of the problems introduced by selection sensitivity, lost features, broken paths and bad matches can be addressed by adding a predictive framework into the feature tracking algorithm. A predictive framework can identify the most likely areas for matching a feature in successive frames. This can help reduce the number of lost features and bad matches, and also help to properly identify a feature that re-appears after being lost.
One predictive technique is to extrapolate the estimated path of each feature being tracked. But the predictive value of individual path extrapolation is problematic, particularly when the path has a limited number of data points. Errors can be reduced if the paths of multiple feature points can be correlated within the predictive model, and enough feature points are tracked and correlated across multiple frames. Information about relative camera positions between frames can assist in guiding and constraining the predictive model, but only if such information is available or can be reliably estimated.
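Individual path extrapolation can be sketched in its simplest form as a constant-velocity prediction from the last two tracked positions, which then centres a bounded search window in the next frame. The helper names `predict_next` and `search_window` are hypothetical, and the constant-velocity model is an assumption chosen for brevity rather than a method described in the text.

```python
def predict_next(path):
    """Extrapolate the next (row, col) from the last two path points.

    With fewer than two points there is no velocity estimate, so the
    last known position is returned unchanged.
    """
    if len(path) < 2:
        return path[-1]
    (r0, c0), (r1, c1) = path[-2], path[-1]
    return (2 * r1 - r0, 2 * c1 - c0)

def search_window(center, radius, shape):
    """Return inclusive bounds (r_min, r_max, c_min, c_max), clamped to the image."""
    r, c = center
    return (max(0, r - radius), min(shape[0] - 1, r + radius),
            max(0, c - radius), min(shape[1] - 1, c + radius))
```

With only two data points behind the prediction, a single noisy measurement shifts the extrapolated position by twice its own error, which illustrates why extrapolation from a short individual path is of limited predictive value.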
The theoretical power of feature tracking methods has been demonstrated in experimental results and field trials, particularly in applications that derive higher-level scene information by tracking and correlating multiple feature points. But the limitations of current feature tracking methods, as discussed above, reduce their utility in many practical settings. A feature tracking method that substantially increases the number of feature points being simultaneously tracked, and tracks them within a constrained predictive framework, would greatly improve the utility of feature tracking within many application areas.