The present invention relates in general to searching for information on a network, in response to an image query sent by a user from a mobile communications device with a camera. In particular, the invention relates to the process of searching for information corresponding to the text in the image query on the network.
A network includes a group of domain databases. The user accesses the network over a communication medium by means of a mobile communications device. Each of the domain databases has an information log containing specific information. For example, in the case of a product search, the specific information may include the product tag, the price tag, the store label, or other identifying information related to the product.
The user searches the network for the specific information by using the mobile communications device. Examples of the mobile communications device include a mobile phone, a personal digital assistant (PDA), and the like. The mobile communications device provides the user with the facility of communicating within the network. Moreover, the mobile communications device can capture images. However, the images captured by the mobile communications device may be poor in quality. For example, the images may have poor contrast or poor resolution, may be blurred, and may have intensity variations. Thus, extracting information from these images to use as an image query is challenging and error-prone. For example, text detection and recognition on such poor-quality images often produce a number of errors.
A search must, therefore, be robust to errors in the text extracted from the images. Additionally, the search must be fast even when the queries are long (queries may vary in length from a few words to a few hundred words). For example, a store or product label may carry a large amount of additional information about the product besides the brand and model numbers, such as detailed specifications. Such information, together with the text extraction errors, makes the queries long and poorly specified, and can reduce the accuracy of the search if not properly handled.
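By way of illustration only, the two requirements stated above (tolerance of text extraction errors and tolerance of long queries with extra text) can be sketched as follows. The sketch is not the method of the invention. The function names `token_similarity`, `score_record`, and `search`, the 0.7 threshold, and the sample records are all hypothetical, and approximate matching with Python's standard `difflib.SequenceMatcher` is an assumption chosen for brevity.

```python
from difflib import SequenceMatcher


def token_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] between two tokens."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score_record(query_tokens, record_tokens, threshold=0.7):
    """Fraction of record tokens that approximately match some query token.

    Normalizing by the record length (not the query length) means that
    extra text in a long query does not lower the score of a good record.
    """
    if not record_tokens:
        return 0.0
    matched = 0
    for record_token in record_tokens:
        best = max(
            (token_similarity(q, record_token) for q in query_tokens),
            default=0.0,
        )
        if best >= threshold:
            matched += 1
    return matched / len(record_tokens)


def search(query_text, records, threshold=0.7):
    """Return (score, record) pairs, best match first (hypothetical helper)."""
    query_tokens = query_text.split()
    scored = [
        (score_record(query_tokens, record.split(), threshold), record)
        for record in records
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


if __name__ == "__main__":
    # Hypothetical query with recognition errors and extra label text.
    noisy_query = "Acne X2O0 carnera 12MP zoom lens warranty"
    for score, record in search(noisy_query, ["Bolt Z9 phone", "Acme X200 camera"]):
        print(f"{score:.2f}  {record}")
```

The linear scan over all records is used here only to keep the sketch short; it would not by itself satisfy the speed requirement on a large domain database, where some form of index would be expected.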
There are several methods for searching for information on the network. One such method involves the use of image-based searches. An image-based search involves searching one of the domain databases for information corresponding to an image captured by a mobile communications device. This domain database is selected based on the image. The method retrieves the identified results corresponding to the image and sends them to the user. However, the method only searches the domain database for matches corresponding to the image as a whole and does not carry out searches for the contents present in the image. For example, an image may include a logo of a company as well as a textual part. The image-based search looks for exact matches corresponding to the specified logo of the company. Valuable information, such as a company name or address that may be included in the textual part, is not found by the image-based search. Further, any background in the image will confuse the image-based search. Hence, the image-based search is incapable of searching for content that is not pre-specified in the image database or collection. In the context of this application, the “content” is only a portion of the image. The portion of the image may include logos of companies, alphanumeric characters of a language, product labels of products, and the like. The alphanumeric characters in the image can be written in any language, such as English, Chinese, French, or Spanish. Further, the image-based search is incapable of detecting matches for the text present in the image.
In light of the foregoing discussion, there is a need for a method and system for searching for information that automatically searches a domain database based on the content information of the image. In this application, the “content information” is the text present in the image together with information about the text geometry, such as the size and location of the text within the image. For example, if an image includes text, then the content information includes information pertaining to the text, i.e., the ASCII or Unicode interpretation of the text, and the size and location of the text within the image. Such a method and system would eliminate the necessity of editing or writing keywords to search for the specific information on the network. This would make the method for searching for information considerably simpler, as well as automatic. Further, the method would include detection and recognition of the text in the image and a search of the domain database for matches corresponding to the detected and recognized text.
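For illustration only, the “content information” defined above (the character interpretation of the text plus its size and location) might be represented as in the following sketch. The `TextRegion` type, its field names, the pixel coordinate convention, and the `build_query` helper are hypothetical and are not taken from this application; ordering the regions from top to bottom and left to right is likewise only an assumed convention.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class TextRegion:
    """One detected piece of text and its geometry (hypothetical layout)."""
    text: str    # Unicode interpretation of the recognized text
    x: int       # left edge of the region, in pixels (assumed convention)
    y: int       # top edge of the region, in pixels (assumed convention)
    width: int   # region width, in pixels
    height: int  # region height, in pixels


def build_query(regions: Iterable[TextRegion]) -> str:
    """Join recognized text into one query string, top-to-bottom then
    left-to-right, skipping regions whose text is empty or whitespace."""
    ordered = sorted(regions, key=lambda region: (region.y, region.x))
    return " ".join(region.text.strip() for region in ordered if region.text.strip())


if __name__ == "__main__":
    # Hypothetical regions, as a text detector and recognizer might report them.
    regions = [
        TextRegion("X200", x=120, y=10, width=60, height=24),
        TextRegion("Acme", x=10, y=10, width=90, height=24),
        TextRegion("   ", x=10, y=30, width=40, height=10),
        TextRegion("12 megapixel", x=10, y=50, width=150, height=12),
    ]
    print(build_query(regions))
```

Because the query string is assembled from the recognized regions, the user does not have to type or edit keywords, which is the simplification the paragraph above describes.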