The problem of object classification has received considerable attention from both the computer vision and machine learning communities. A key challenge is to recognize any member of a category of objects despite wide variations in visual appearance due to geometric transformations, changes in viewpoint, or changes in illumination. Two-dimensional (2D) methods for classification of vehicles have emphasized the use of 2D bags of features or feature constellations from a limited set of representative views. In the last decade, the proliferation of 2D methods has been facilitated by the superabundance of images on the Internet as well as the systematic annotation and construction of image benchmarks and corpora. 2D approaches have yielded significant advances in recognition performance, particularly on controlled datasets.
Unfortunately, 2D methods are limited in that they cannot leverage the properties of 3D shapes for recognition. The typical 2D method of handling view variance applies several single-view detectors independently and combines their responses via arbitration logic. Some recent work has focused on a single integrated multi-view detector that accumulates evidence from different training views. Such methods have only been successfully attempted with controlled datasets and with broad classification categories.
A more difficult task is to make classification decisions at a very fine level of distinction, e.g., distinguishing between different types of vehicles rather than between the class of vehicles and the class of airplanes. For such a task, 2D methods that make broad generalizations over object classes with only a coarse utilization of geometric relations are ill-suited, and 3D models become indispensable.
Much of the early work in 3D model-based recognition included methods for matching wire-frame representations of simple 3D polyhedral objects to detected edges in an image with no background clutter and no missing parts. Such methods further included aligning silhouettes of rendered models with edge information extracted from scene imagery. Unfortunately, these methods have produced mismatches due to faulty edge detection, lack of scene contrast, blurry imagery, scene clutter, and noise, among other factors complicating scene analysis.
First, such prior art 3D model-based recognition methods have been unable to harness appearance as a rich source of information. To date, there have been no attempts to accurately simulate scene conditions in the rendered model and to compare rendered models with the actual scene. Second, like 2D approaches, most of the prior art work on 3D models for classification has been geared towards broad categories of objects rather than a finer analysis, in part due to the limitations of employing silhouettes and edges.
Accordingly, what would be desirable, but has not yet been provided, is a 3D method and system for distinguishing between types of vehicle models in aerial imagery.