Recent years have seen rapid technological development in automatic object detection in digital images. Indeed, as a result of the proliferation of personal computing devices and digital cameras, individuals and businesses now routinely manage large repositories of digital images and digital videos. Accordingly, automatic object detection in digital images has become a ubiquitous need for individuals and businesses in a variety of scenarios, ranging from casual users seeking to locate specific moments in a personal photo collection to professional graphic designers sorting through stock images to enhance creative projects.
Unfortunately, conventional object detection systems suffer from a number of drawbacks. For example, the most accurate conventional object detection systems all rely on some form of machine learning. These conventional machine learning methods require supervised training on human-annotated data; without it, they do not provide useful predictions. Generating annotated images for training is both time-consuming and expensive.
Due to the need for supervised data, conventional object detection systems are able to identify only a small number of object types. Indeed, conventional object detection systems typically are able to identify only 20 types of objects with reasonable accuracy. Some conventional object detection systems can identify as many as 80 or even 120 different types of objects, but at unacceptable accuracy levels.
Accordingly, a need exists for robust, efficient, and effective detection of objects in large datasets of digital images.