1. Field of the Application
The present application generally relates to image processing and object recognition, and more specifically relates to locating and identifying objects from a plurality of images in a database.
2. Related Art
In the context of digital content (e.g., images, video), a consumer may view and interact with content through a dedicated network device (e.g., desktop computer, laptop computer, smart phone, tablet computer, personal digital assistant, and/or any other type of processing device). Conventionally, there is no suitable way to identify product information about objects present in such content in an automated, rapid, or scalable manner. Rather, product information is generally conveyed via a point-and-click approach tailored for a single object represented in a single image. For example, an image representing a product may be displayed with a hyperlink to the product information.
It would be advantageous if a network device, running an interactive application (e.g., a consumer application) or other content, could be provided (e.g., by a media analysis server) with product information about objects present in that content.
It would also be advantageous if unknown content could be discovered (e.g., by a media analysis server) when browsing a large database of content. In this way, a list of objects that are visually similar to a known object, but that were previously unknown because of the large size of the content database, could be returned via a content server to a network device running an interactive application.
It would also be advantageous if objects could be located (i.e., detected) from content, and corresponding visually similar objects could be identified (e.g., by a media analysis server) solely based on visual characteristics without the need to add any information in text form, such as keywords or labels, or other types of metadata stored in a database.
It would also be advantageous if objects could be located (i.e., detected) from content, corresponding visually similar objects could be identified (e.g., by a media analysis server), and information about these objects could be returned (e.g., by the media analysis server) to a network device running an interactive application to allow visual interaction without modification or distortion of the objects, without the need for special markings on the objects, without requiring special lighting conditions, and without human intervention in the process (i.e., automatically).
It would also be advantageous if information about objects located (i.e., detected) and identified from content (e.g., by a media analysis server) could be returned (e.g., by the media analysis server) to a network device running an interactive application and presented without obscuring the desired digital content or perimeter frames, and without annoying the user/consumer with pop-up windows, as is commonly done in conventional solutions for capitalizing on advertising revenue from digital content.
It would also be advantageous if objects could be located (i.e., detected) from content, and corresponding visually similar objects could be identified (e.g., by a media analysis server) within a reasonable amount of time, suitable for interactive applications.