Typically, classification of objects in an image is performed using features extracted from an analysis window that is scanned across the image. This sequential scanning search can be very computationally intensive, especially if a small window is used, because a classification must be performed at each window position. Conventional approaches to reducing the computational load are based on reducing the search space by using another sensor, such as radar, to cue the vision system and measure the range of the object. Limitations of the radar approach include high cost, false alarms, the need to associate radar tracks with visual objects, and overall system complexity. Alternatively, previous vision-only approaches have reduced the search space by using motion-based segmentation with background estimation methods to generate areas of interest (AOIs) around moving objects, and/or by using stereo vision to estimate range in order to reduce searching in scale. These methods add cost and complexity by requiring additional cameras and computations. Motion-based segmentation is also problematic under challenging lighting conditions or if background motion exists, as is the case for moving host platforms.
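By way of illustration only, the cost of the sequential scanning search described above can be sketched by counting the window positions at which a classification would be performed. The function names, the image and window dimensions, and the stride parameter are hypothetical and are not taken from any particular prior system.

```python
def count_window_positions(image_w, image_h, win_w, win_h, stride=1):
    """Count placements of a win_w x win_h window inside an image.

    Each placement corresponds to one classifier evaluation in an
    exhaustive sequential scan at a single scale.
    """
    if win_w > image_w or win_h > image_h:
        return 0  # the window does not fit at all
    positions_x = (image_w - win_w) // stride + 1
    positions_y = (image_h - win_h) // stride + 1
    return positions_x * positions_y


def count_multiscale_positions(image_w, image_h, window_sizes, stride=1):
    """Sum the placements over several window sizes (a search in scale)."""
    return sum(
        count_window_positions(image_w, image_h, w, h, stride)
        for (w, h) in window_sizes
    )


if __name__ == "__main__":
    # Hypothetical 640 x 480 image scanned with two window sizes.
    for size in [(64, 128), (32, 64)]:
        n = count_window_positions(640, 480, size[0], size[1])
        print(size, n)
```

Under these assumptions, halving the window dimensions increases the number of positions, and searching over several window sizes adds the per-scale counts together, which is why reducing the search in position and scale is the focus of the approaches discussed here.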
Motion-based systems form models of the static background in order to detect moving objects as “blobs” or silhouettes that do not match the background model. Performance degrades, however, if the background contains high-motion elements or if the camera is panning, zooming, moving on a vehicle or aircraft, or being carried by the user. Motion-based video analysis systems are also “brittle” in that the user must define rules for classifying the motion blobs, and those rules are specialized for each installation. These systems do not work well “out of the box” and require substantial setup and customization for each installation.
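The sensitivity of a background model to camera motion can be illustrated with a minimal sketch. The simple per-pixel differencing below, the striped synthetic background, and the threshold value are all assumptions chosen for illustration; actual motion-based systems use more elaborate background estimation.

```python
def foreground_fraction(background, frame, threshold=30):
    """Fraction of pixels whose intensity differs from the background model."""
    changed = 0
    total = 0
    for bg_row, frame_row in zip(background, frame):
        for bg_pixel, frame_pixel in zip(bg_row, frame_row):
            total += 1
            if abs(bg_pixel - frame_pixel) > threshold:
                changed += 1
    return changed / total


def make_striped_background(width, height, stripe=4):
    """Synthetic background: vertical stripes alternating between 0 and 200."""
    return [
        [200 if (x // stripe) % 2 else 0 for x in range(width)]
        for _ in range(height)
    ]


def pan_right(image, dx):
    """Simulate a camera pan by shifting every row dx pixels (with wraparound)."""
    return [row[-dx:] + row[:-dx] for row in image]


if __name__ == "__main__":
    background = make_striped_background(32, 8)

    # Static camera: only a small 2 x 2 object differs from the background.
    static_frame = [row[:] for row in background]
    for y in (3, 4):
        for x in (10, 11):
            static_frame[y][x] = 255
    print("static camera:", foreground_fraction(background, static_frame))

    # Panning camera: no object at all, yet the scene no longer matches the model.
    print("panning camera:", foreground_fraction(background, pan_right(background, 4)))
```

In this sketch the static camera flags only the small object, whereas a pan of one stripe width causes every pixel to be flagged as foreground, even though nothing in the scene has moved.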
Additionally, there have been attempts to use genetic and evolutionary algorithms for object detection. Genetic algorithms (GAs) have previously been used to decrease the search space in vision systems. GA systems employ a population of individual solutions that undergo crossover and mutation in order to maximize a fitness function. Other efforts have used GAs for training and adapting neural networks to recognize objects. The chromosome representation of solutions and the crossover operation in GAs often cause small changes in the representation to produce large changes in the solution. This results in a “noisy” evolution of solutions and a longer time to convergence.
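The effect of the chromosome representation can be seen in a minimal sketch. It assumes, purely for illustration, that a window coordinate is encoded as a 10-bit unsigned binary chromosome with the most significant bit first; the encoding and the function names are hypothetical and do not describe any specific prior GA system.

```python
def decode(bits):
    """Interpret a list of 0/1 values as an unsigned integer, MSB first."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def flip_bit(bits, index):
    """Return a copy of the chromosome with a single bit mutated."""
    mutated = list(bits)
    mutated[index] ^= 1
    return mutated


if __name__ == "__main__":
    # Hypothetical chromosome encoding the x-coordinate 256 in 10 bits.
    chromosome = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    original = decode(chromosome)
    for index in range(len(chromosome)):
        mutated = decode(flip_bit(chromosome, index))
        print(index, mutated, abs(mutated - original))
```

Each mutation changes exactly one bit of the representation, yet the resulting shift in the decoded coordinate ranges from a single pixel to half of the encodable range, depending on which bit is flipped. This is one way in which a small change in the representation can produce a large change in the solution.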
Simulated annealing has also been used for optimization problems having discontinuous solution spaces with many local optima. However, the annealing schedule results in many more classifier evaluations than are necessary for cognitive swarms, making it impractical for real-time applications in computer vision.
Thus, a continuing need exists for an effective and efficient object recognition system for classifying objects in an image.