One or more embodiments relate to processing visual information.
Processing techniques have been developed to detect features in video using, for example, Mixture-of-Gaussian (MOG), Hierarchical Bayesian, and Hidden Markov models. The features are located in one frame, and an attempt is then made either to find matching features in a subsequent adjacent frame or to perform block matching between adjacent frames. These techniques have proven to be time-consuming because of their computational complexity, and they have also been found to be prone to errors resulting from lighting changes, occlusion, rotation, scale differences, and other effects.
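For illustration only, the block matching between adjacent frames mentioned above may be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular prior technique: frames are assumed to be grayscale images stored as lists of rows, the matching cost is assumed to be a sum of absolute differences, and the names sad, match_block, and search are hypothetical.

```python
def sad(prev, curr, y0, x0, y1, x1, n):
    """Sum of absolute differences between an n x n block of prev
    at (y0, x0) and an n x n block of curr at (y1, x1)."""
    return sum(
        abs(prev[y0 + dy][x0 + dx] - curr[y1 + dy][x1 + dx])
        for dy in range(n)
        for dx in range(n)
    )


def match_block(prev, curr, y, x, n, search):
    """Exhaustively search curr for the block of prev at (y, x).

    Returns the (dy, dx) displacement with the lowest cost, testing
    every offset within +/- search pixels (an assumed search window).
    """
    height, width = len(curr), len(curr[0])
    best_cost, best_disp = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ty, tx = y + dy, x + dx
            # Skip candidate blocks that fall outside the frame.
            if ty < 0 or tx < 0 or ty + n > height or tx + n > width:
                continue
            cost = sad(prev, curr, y, x, ty, tx, n)
            if best_cost is None or cost < best_cost:
                best_cost, best_disp = cost, (dy, dx)
    return best_disp
```

In this sketch, each block requires up to (2·search + 1)² candidate comparisons of n² pixels each, which illustrates the computational cost noted above. Because the cost compares raw pixel values, a lighting change between frames alters the cost even when the block has not moved.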
Additionally, these techniques take a bottom-up approach to feature finding. Such an approach locates features based on regions, which, for example, may be pre-chosen fixed blocks of n×n size. A bottom-up approach also detects, segments, and tracks one feature and then attempts to detect, segment, and track increasingly large numbers of features. When the number of features becomes large, objects cannot be detected or tracked with any degree of accuracy. Bottom-up approaches, therefore, have proven unsuitable for many applications.