Recent developments in machine vision have demonstrated remarkable improvements in the ability of computers to properly identify particular objects in a viewing field. Most of these advances rely on color-texture analyses that require target objects to possess one or more highly distinctive, local features that can be used as distinguishing characteristics for a classification algorithm. Many objects, however, consist of materials that are prevalent across a wide variety of object categories. For example, many mammals are covered with hair or fur, making a detector configured/trained to identify the presence of hair/fur a potentially good discriminator between mammals and non-mammals, but not a particularly good discriminator between different mammalian species. Similarly, many trees have leaves, many man-made objects are made of painted metal, and so forth, such that color-texture detectors configured/trained to identify leaves or painted metal are good for some categorizations, but not for others. Much less effort has been made to characterize objects based on shape, or the particular way the component features are arranged relative to one another in two-dimensional (2D) image space.
The overarching goal of creating a machine that can see as well as a human has led prior researchers to focus on amplifying computing power to match that of the human visual system, i.e., petaflops of computing power. Although historically such computing power has not been available to the vast majority of computer users, the advent of cloud computing and the introduction of graphics processing units, multi-core processors, smart caches, solid-state drives, and other hardware acceleration technologies suggest that access to sufficient computing power per se should not be the major impediment to effective machine-based object recognition going forward. More importantly, the goal remains to develop object-recognition systems that are sufficiently accurate to support commercial applications. Specifically, an algorithm capable of highly accurate object identification in an image, on the level of the typical performance of a Google® search based on a keyword, is likely to receive widespread consumer acceptance. Accordingly, a system and method for highly accurate, automated object detection in an image or video frame may be beneficial.