Digital images and videos are pervasive media forms in everyday life, and most people interact with them multiple times a day. As the amount of image and video data being generated and consumed grows, people need help identifying images and videos. Therefore, many mainstream image indexing, search, and retrieval tools, such as Google Image Search™, exist to assist in managing the large amounts of image and video data available to the public.
These types of tools also find specific applications beyond the mainstream. Imaging, particularly digital imaging, is a critical diagnostic and research instrument in modern medicine. In the medical imaging field, content-based image retrieval (CBIR), which classifies an image based on the information contained within the image itself, is typically preferable to keyword-based or tag descriptor-based approaches, which require manual human annotation and professional judgment.
Most known CBIR approaches (used either in medical imaging applications or in other general and specific applications) rely on some form of feature detection. Examples of feature detection techniques used in CBIR include Scale-Invariant Feature Transform (SIFT), Speeded Up Robust Features (SURF), and Binary Robust Invariant Scalable Keypoints (BRISK). The feature detection approaches are typically employed in a “bag of words” or “bag of features” model, which maps codewords or vectors to patches of an image representing the features of the image. The bag of words and bag of features models are generally designed to perform well at capturing the global appearance of the scene in an image or video frame. However, these approaches may underperform in capturing spatial information and the local details of scene objects, such as the shape of a tumor in a medical imaging scan.
Furthermore, codewords and vectors used for mapping features incur large storage space requirements, which limit the real-time performance of feature detection-based CBIR systems.
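The bag of features model described above can be illustrated with a minimal sketch. The sketch assumes that each image patch has already been reduced to a small numeric descriptor and that a codebook of codewords is already available; the two-dimensional descriptors, the three-entry codebook, and the function names `nearest_codeword` and `bag_of_features` are hypothetical and chosen only for illustration. Each patch is assigned to its nearest codeword, and the image is summarized by the count of patches per codeword.

```python
from collections import Counter
import random


def nearest_codeword(descriptor, codebook):
    """Return the index of the codeword closest to the descriptor
    (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(descriptor, codebook[i]))


def bag_of_features(descriptors, codebook):
    """Summarize an image as a histogram of codeword counts over its patches."""
    counts = Counter(nearest_codeword(d, codebook) for d in descriptors)
    return [counts.get(i, 0) for i in range(len(codebook))]


# Hypothetical toy data: a real system would use descriptors such as SIFT
# vectors and a much larger learned codebook.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
patches = [(0.1, 0.0), (0.9, 1.1), (0.1, 0.9), (1.0, 0.8), (0.0, 0.2)]

histogram = bag_of_features(patches, codebook)

# Reordering the patches (a stand-in for moving them to different image
# locations) leaves the histogram unchanged.
shuffled = patches[:]
random.Random(0).shuffle(shuffled)
shuffled_histogram = bag_of_features(shuffled, codebook)

print(histogram, shuffled_histogram)
```

Because the histogram records only how many patches map to each codeword, and not where those patches lie, two images with the same patches in different arrangements receive the same characterization in this sketch. This is one way to see why such models may lose spatial information and local shape detail.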
Generally, in order to determine whether two images are similar, a CBIR system should preferably characterize each image uniquely, such that the characterizations of similar images exhibit considerable overlap. Conventional CBIR methods require sophisticated image characterization to achieve acceptable image retrieval accuracy; however, sophisticated image characterization is inefficient and requires large data storage space and processing time.