This disclosure relates to image processing systems and, more particularly, to performing visual searches with image processing systems.
Visual search in the context of computing devices or computers refers to techniques that enable a computer or other device to perform a search for objects and/or features among other objects and/or features within one or more images. Recent interest in visual search has resulted in algorithms that enable computers to identify partially occluded objects and/or features in a wide variety of changing image conditions, including changes in image scale, noise, illumination, and local geometric distortion. During this same time, mobile devices have emerged that feature cameras, but which may have limited user interfaces for entering text or otherwise interfacing with the mobile device. Developers of mobile devices and mobile device applications have sought to utilize the camera of the mobile device to enhance user interactions with the mobile device.
To illustrate one enhancement, a user of a mobile device may utilize a camera of the mobile device to capture an image of any given product while shopping at a store. The mobile device may then initiate a visual search algorithm within a set of archived feature descriptors for various reference images to identify the product shown in the image (which may be referred to as a “search image”) based on matching reference imagery. After identifying the product, the mobile device may then initiate a search of the Internet and present a webpage containing information about the identified product, including a lowest cost for which the product is available from nearby and/or online merchants. In this manner, the user may avoid having to interface with the mobile device via a keyboard (which is often a “virtual” keyboard in the sense that it is presented on a touch screen as an image with which the user interfaces) or other input mechanism, and may instead merely capture a search image to initiate the visual search and subsequent web searches.
While there are a number of applications that a mobile device equipped with a camera and access to visual search may employ, visual search algorithms, such as a scale invariant feature transform (SIFT) algorithm, may be deficient in terms of performing feature matching. Feature matching refers to an aspect of visual search algorithms during which search feature descriptors extracted from the search image are matched against reference feature descriptors extracted from the reference images.
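The feature matching described above can be sketched, at its simplest, as brute-force nearest-neighbor matching of descriptor vectors. The sketch below is illustrative only and is not a particular patented or standardized implementation; the toy four-dimensional descriptors stand in for real SIFT descriptors, which are typically 128-dimensional.

```python
import numpy as np

def match_features(search_desc, ref_desc):
    """Pair each search feature descriptor with its nearest reference
    feature descriptor under Euclidean distance (brute-force matching)."""
    matches = []
    for i, d in enumerate(search_desc):
        dists = np.linalg.norm(ref_desc - d, axis=1)
        j = int(np.argmin(dists))
        matches.append((i, j, float(dists[j])))
    return matches

# Toy 4-dimensional descriptors; real SIFT descriptors are 128-dimensional.
search = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0]])
ref = np.array([[0.9, 0.1, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0],
                [0.1, 0.9, 0.0, 0.0]])

# Each search descriptor is paired with the index of its closest
# reference descriptor, along with the matching distance.
print(match_features(search, ref))
```

In practice, the archived reference descriptors would number in the thousands or millions, so production systems replace this brute-force loop with approximate nearest-neighbor structures, but the matching criterion is the same.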
To illustrate these deficiencies, consider the SIFT algorithm, which may discard reference feature descriptors that would otherwise match a search feature descriptor in instances when the search feature descriptor and the reference feature descriptors are each extracted from a repeating feature of the search and reference images, such as distinctive arches or windows that repeat across a building. Moreover, the SIFT algorithm commonly returns only a single image in response to any given visual search, where this returned image is determined to be the “best match” algorithmically by the SIFT algorithm. However, users may not determine what constitutes a “best match” in the same way as the SIFT algorithm, which may result in user frustration as the single SIFT best match result may not match the user's expectations.
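The discarding behavior described above commonly arises from a distance-ratio acceptance test: a candidate match is accepted only when the nearest reference descriptor is significantly closer than the second-nearest. When a feature repeats, such as the arches in the example above, two reference descriptors lie at nearly equal distances, so the test rejects the match even though one of the candidates may be correct. The following is a minimal sketch of this behavior under that assumption; the 0.8 threshold and the toy three-dimensional descriptors are illustrative choices, not values mandated by any particular SIFT implementation.

```python
import numpy as np

def ratio_test_match(search_desc, ref_desc, ratio=0.8):
    """Accept a match only if the nearest reference descriptor is
    significantly closer than the second-nearest one. Descriptors
    extracted from repeating features yield two nearly equal
    distances, so the candidate match is discarded."""
    dists = np.linalg.norm(ref_desc - search_desc, axis=1)
    order = np.argsort(dists)
    nearest, second = dists[order[0]], dists[order[1]]
    if nearest < ratio * second:
        return int(order[0])  # unambiguous match: accept
    return None              # ambiguous match: discard

# A search descriptor taken from one arch of a building whose arches
# repeat: two reference descriptors match almost equally well.
search = np.array([1.0, 0.0, 0.0])
repeating_refs = np.array([[0.95, 0.05, 0.0],   # arch instance 1
                           [0.94, 0.06, 0.0],   # arch instance 2
                           [0.0,  0.0,  1.0]])  # unrelated feature

print(ratio_test_match(search, repeating_refs))  # None: discarded as ambiguous
```

Against a reference set without the repeated arch, the same search descriptor would pass the ratio test and be matched, which is why repeating structures in particular expose this deficiency.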