In a visual search, a user points a camera of a device such as a phone at an object and has the phone recognize the object using an image captured by the camera. Once the object is recognized, the phone can take a multitude of actions. For example, the object might be a building, and the phone could present the user with additional information such as the name, address, occupants, or architecture of the building, or search results, e.g., from the Internet, pertaining to the building. Illustratively, the results could be of restaurants in or near the building. As another example, the object could be a poster for a movie and the phone could present information about the movie, a trailer for the movie, local theaters showing the movie, and the like. As yet another example, the object could be a barcode for a product, and the phone could return a detailed description of the product, nearby stores having the product, and the prices of the product at those stores.
In order to recognize an object, the phone can access a visual search database, which is typically stored on the phone but may also reside at a remote server. The images in the visual search database are commonly called tags. These images are either captured with a dedicated device by a professional service team or provided by multiple sources using a variety of cameras and devices.
For image matching, the captured image is saved in memory and then matched, as an input image, against the images in the visual search database. Typically, the phone presents multiple possible matches to the user, who confirms which of the presented images (if any) shows the object that he or she captured. When an object in a presented image matches the object in the user-taken image, information associated with the object is displayed. Typically, the visual search database contains images of all possible objects the user may need to recognize.
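By way of illustration only, the matching flow described above might be sketched as follows. The description does not specify how images are compared, so the coarse intensity histogram and L1 distance used here are assumptions standing in for a real image descriptor, and the names `Tag`, `histogram`, `distance`, and `top_matches`, as well as the pixel values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tag:
    """A database image ("tag") plus the information shown on a match."""
    name: str
    pixels: list  # grayscale values 0-255, flattened (toy data)
    info: str


def histogram(pixels, bins=4):
    """Normalized intensity histogram (assumed stand-in for a real descriptor)."""
    if not pixels:
        raise ValueError("image has no pixels")
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    return [c / len(pixels) for c in counts]


def distance(a, b):
    """L1 distance between two histograms; 0 means identical histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def top_matches(query_pixels, database, k=3):
    """Return the k tags closest to the captured image, best match first."""
    q = histogram(query_pixels)
    ranked = sorted(database, key=lambda t: distance(q, histogram(t.pixels)))
    return ranked[:k]


# Hypothetical visual search database with three tags.
database = [
    Tag("building", [30, 40, 50, 60, 200, 210], "Name, address, occupants"),
    Tag("movie_poster", [250, 240, 230, 10, 20, 220], "Trailer, local theaters"),
    Tag("barcode", [0, 255, 0, 255, 0, 255], "Product description, prices"),
]

# The captured input image is held in memory and matched against the database.
captured = [35, 45, 55, 58, 205, 215]
candidates = top_matches(captured, database, k=2)

# The candidates would be presented to the user; here the first one is
# simply treated as the confirmed match.
confirmed = candidates[0]
print(confirmed.info)
```

In this sketch the candidate list plays the role of the multiple possible matches presented for confirmation, and the `info` field plays the role of the information displayed once a match is confirmed.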
This type of visual search has many benefits, but could be improved.