Image processing operations relating to network-based or web-based activities can present a multitude of complications based on various factors associated with images. A common concern is the file size of an image and the increased processing load that accompanies larger files. Another complication is the accuracy and efficiency of image search operations.
For example, complications include increased data storage requirements when multiple copies of an image are stored. Given the size of these images, it is inefficient to store multiple copies or near duplicates of the same image, yet there are currently limited options for determining whether a data store contains duplicate or near-duplicate copies of an image. Existing techniques can include examining the metadata or overhead data of the files and, if there is a match, performing a direct image comparison. Direct image comparison is very expensive in terms of processing requirements and is therefore not viable outside of very small-scale operations.
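The cost of the direct-comparison approach described above can be sketched as follows. This is a hypothetical illustration, not the technique disclosed here; the function names and the modeling of images as flat tuples of pixel values are assumptions made for clarity.

```python
# Hypothetical sketch of the direct-comparison baseline described above.
# Images are modeled as flat tuples of pixel values; names are illustrative.

def images_equal(pixels_a, pixels_b):
    """Exact pixel-by-pixel comparison: O(W*H) work per image pair."""
    return len(pixels_a) == len(pixels_b) and all(
        a == b for a, b in zip(pixels_a, pixels_b)
    )

def find_duplicates(image_store):
    """Naive duplicate scan: compares every pair, O(N^2) comparisons."""
    names = list(image_store)
    duplicates = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if images_equal(image_store[names[i]], image_store[names[j]]):
                duplicates.append((names[i], names[j]))
    return duplicates
```

For a store of N images of W by H pixels, this requires on the order of N(N-1)/2 pairwise comparisons, each touching up to W times H pixels, which illustrates why direct comparison does not scale beyond very small collections. Note also that an exact comparison fails entirely for near duplicates, where even one altered pixel defeats the match.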
Another example is the search results produced by a search operation. Search engines include the ability to perform an image search. Typically, this search is based on metadata or related information associated with an image, e.g., tag data from a photo-sharing site. Existing web-based systems do not accept an image itself as the search input, and output results are not optimized by culling duplicate and near-duplicate images. A search result can therefore include multiple copies of the same image, or near-duplicate images retrieved from different locations, thereby obscuring the search results. Similarly, a person is unable to perform a search operation to determine whether anyone has improperly used his or her image, or whether another person has used and subsequently modified his or her image.
Existing systems do not account for image duplication because of the computational overhead associated with image processing. Web-based searching operations are time-sensitive and utilize techniques that improve searching speed without detracting from search accuracy. Performing a straight image-to-image comparison in a web-based searching operation is unrealistic, as determining duplicate images detracts from the speed of the searching operation. As such, there exists a need for a technique for efficiently determining near-duplicate images among multiple images.
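To make the notion of near-duplicate detection concrete, one common family of low-cost techniques compares compact perceptual signatures rather than raw pixels. The sketch below shows an average-hash approach; it is an illustrative assumption, not the technique disclosed in this document, and the grid input, function names, and threshold value are all hypothetical.

```python
# Illustrative sketch (not the disclosed technique) of a low-cost
# alternative to direct comparison: an average perceptual hash.
# Input is assumed to be a small grayscale grid (e.g., already resized).

def average_hash(grid):
    """One bit per pixel: 1 if the pixel is above the mean intensity."""
    flat = [p for row in grid for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)

def hamming_distance(hash_a, hash_b):
    """Number of differing bits; small distances suggest near duplicates."""
    return sum(a != b for a, b in zip(hash_a, hash_b))

def is_near_duplicate(grid_a, grid_b, threshold=1):
    # threshold is an illustrative assumption, not a standard value
    return hamming_distance(average_hash(grid_a), average_hash(grid_b)) <= threshold
```

Because each image is reduced to a short bit string once, comparing two images costs only a handful of bit operations, and small modifications to an image tend to leave most bits unchanged, which is what allows near duplicates to be detected without a full pixel-by-pixel comparison.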