Recent advances in image capture technology include features that allow users to capture and record images in rapid succession, often within microseconds or seconds of each other, thus creating large sets of user photos. As storage costs have decreased, users often store large numbers of captured photos both on their cameras and in remote storage. Instead of reviewing and organizing photos on the camera or within storage while their memory of the recently captured photos is still fresh, users often simply upload the entire set to content management systems and review and organize their captured images at a later date.
As the number of photos both on the camera and within various storage avenues increases, the task of organizing stored photos can become overwhelming. Adding to the complexity of organizing their photos, a given user may also have images from multiple sources, for example, images uploaded to a social network or photograph sharing service, such as Facebook or Instagram, images uploaded to a blog, and the original image which remains on his or her computer or digital camera. Or, for example, a user may have photos of essentially the same content, but taken by different persons at a family gathering or social event, which are then shared amongst all of the participants or invitees. If multiple images of the same, or very similar, content are uploaded by such users to content management systems or services, user storage and system bandwidth may be wasted, and the user's image collection is cluttered with little marginal benefit. Because users often do not inventory the various photos they upload to such services, or the quality and size of each, they generally have no facility to cull duplicates or near duplicates from their collections of content. Thus, as the number of photos stored for a given user increases, and multiple, often redundant, sources of content are drawn upon for storage by users, the issue of duplicate and near duplicate content becomes more and more acute. What is thus needed in the art are systems and methods to detect duplicate and near duplicate photos and images, and to refer such detected duplicates and near duplicates to users and/or system resources for appropriate culling or decision making.