The invention relates generally to a server that searches for digital documents in an interactive and visual way. Examples of digital documents include photographs, product descriptions, and webpages.
More specifically, this invention relates to a server using Bayesian-based techniques to greatly shortcut the search iterations that are required in order to identify a document from a large library of documents. As a result, the server can complete the search task with reduced processing load and bandwidth usage.
Current computer search technologies allow servers to respond to search queries with an ordered list of results. The queries may be in the form of a structured query language, natural language text, speech, or a reference image. However, the results returned are often unsatisfactory. The problem is particularly acute where the server is looking for a document which has particular visual characteristics, which may not be easily articulated in words or by a description of how the target document differs from a reference image. Stated simply, someone must see the final identified document in order to determine whether the search was successful. As a result of this uncertainty, the server can be inefficient at identifying the document and can consume an unnecessarily high level of resources (e.g., bandwidth and processing).
For example, the server may be searching for a single particular photograph. If the library is not pre-annotated with the photo characteristics that are sufficient to uniquely identify the target photograph, it would be extremely difficult for the server to find such a photograph. The server may be relegated to showing images iteratively to a user until the user indicates success. As another example, the server may be searching through a vendor's catalog of shoes for a shoe design which might be aesthetically pleasing to a particular user. Unless the server can receive, from the user, an articulation of what constitutes an aesthetically pleasing design, the server is relegated to consuming an unnecessarily high level of bandwidth and processing power in order to provide an extended ongoing iterative browsing experience to the user. As yet another example, the server may be searching the web in order to find a web page that might look interesting to the user to read. Again, unless the server knows beforehand what subject matter will be interesting to the user at that moment, the server is relegated to consuming an unnecessarily high level of bandwidth and processing power in order to provide an extended ongoing iterative browsing to the user, forcing the server to occupy users' time and consume resources while receiving random clicks through various links, which may be the only way to find the target web page. As yet another example, the server may be searching for apparel or accessories that look nice with other apparel already owned by a particular person. As yet another example, the server may be searching for images similar to a prototype image, but different in ways that are not easily articulated. Again, without being able to receive an exact articulation of characteristics of the target document, current computer search technologies implemented by the server may not be able to help.
Some libraries are annotated with metadata, such as date and location (for a photo library), or type and features of products (for a product catalog). But many are not annotated, and even in those which are, the annotations may not be sufficiently specific to allow the server to home in on the desired document quickly. As touched on above, some search technologies allow the server to perform searches iteratively, thereby gradually narrowing the field of possible documents until the target document is found. But these often still take a long time, and cause the server to consume an unnecessarily high level of bandwidth and processing power by requiring the server to offer many different collections of candidate documents before the target document is found.
What is needed is an iterative visual search method that can allow the server to significantly shortcut the search process, allowing the server to locate the target document much more quickly. Methods according to aspects of the invention as described herein allow the server to search for the target document iteratively, and are much better at selecting a useful field of candidate documents at each next iteration. Methods according to aspects of the invention can thereby reduce the number of search iterations processed by the server by on the order of 40% or more, which in turn greatly reduces the consumption of bandwidth and processing power required by the server.