Search engines are tools used to perform searches based on a query and return results or hits either as electronic documents or links to electronic documents that match the query. Due to the proliferation of information available in electronic sources, it has become increasingly important to avoid overwhelming a user with information likely to be irrelevant. Moreover, it is increasingly important to aid the user in navigating the relevant information more efficiently.
In service of these goals, most search engines contain an internal algorithm to determine which documents are most relevant to the search query. For example, some search engine algorithms consider both the frequency and the positioning of keywords in a document to determine relevancy. If a keyword appears only once, in a footnote to a document, there is a relatively small chance that the document contains the information a user is seeking by searching for the keyword. In contrast, if keywords appear repeatedly in a document and appear in the document title or headings, there is a relatively greater likelihood that the document contains information about the keywords that would be relevant, and hence useful, to the user. The first case would result in a low relevancy number associated with the document, whereas the second would result in a higher relevancy number. Although general principles such as these are known in the art of search engine software engineering, the exact algorithms used by a search engine to determine a relevancy metric are usually proprietary information protected as trade secrets.
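The general principle described above, in which both keyword frequency and keyword position contribute to relevancy, can be illustrated with a minimal sketch. It is not the algorithm of any particular search engine, since those are typically proprietary. The field names, the weights in `FIELD_WEIGHTS`, and the functions `tokenize` and `relevancy` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical weights (an assumption for illustration): a keyword in the
# title counts more than one in the body, and one in a footnote counts less.
FIELD_WEIGHTS = {"title": 5.0, "headings": 3.0, "body": 1.0, "footnotes": 0.25}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevancy(document, query):
    """Return a position-weighted count of query keywords in the document.

    `document` maps a field name (e.g. "title") to that field's text.
    Fields without an entry in FIELD_WEIGHTS default to a weight of 1.0.
    """
    keywords = set(tokenize(query))
    score = 0.0
    for field, text in document.items():
        weight = FIELD_WEIGHTS.get(field, 1.0)
        hits = sum(1 for token in tokenize(text) if token in keywords)
        score += weight * hits
    return score


# A keyword that appears once, in a footnote, yields a low score.
footnote_only = {
    "title": "Annual report",
    "body": "Revenue grew.",
    "footnotes": "See the tax appendix.",
}
# Repeated keywords, including in the title and headings, yield a higher score.
on_topic = {
    "title": "Tax filing guide",
    "headings": "Tax deadlines",
    "body": "File your tax return early. Tax forms vary.",
}

print(relevancy(footnote_only, "tax"))  # 0.25
print(relevancy(on_topic, "tax"))       # 10.0
```

Under these assumed weights, the document that mentions the keyword only in a footnote scores 0.25, while the document with the keyword in its title, headings, and body scores 10.0, matching the low and high relevancy cases described above.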
Once a search is run and relevancy is calculated for each result, the results are sorted according to the relevancy metric, and the results list can be displayed to the user in this order. Since the most relevant results appear at the top of the list, users who begin scanning the list first encounter those results most likely to contain useful information.
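The ordering step can be sketched as a descending sort on the relevancy metric. This is an illustrative sketch only: the function name `rank_results`, the document identifiers, and the scores shown are hypothetical.

```python
def rank_results(scored_results):
    """Sort (document_id, relevancy) pairs from most to least relevant.

    Python's sort is stable, so results with equal relevancy keep their
    original relative order.
    """
    return sorted(scored_results, key=lambda pair: pair[1], reverse=True)


# Hypothetical results with already-computed relevancy numbers.
scored = [("doc-a", 0.25), ("doc-b", 10.0), ("doc-c", 3.5)]

for document_id, score in rank_results(scored):
    print(document_id, score)
# doc-b 10.0
# doc-c 3.5
# doc-a 0.25
```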
Some conventional search engine systems display a given number of search results, such as the top 100 results for a search, regardless of what search query is entered. If a search is too broad, the user may end up wasting time reading too many results in an attempt to find a real result of interest, or worse, the real result of interest might not have made the top 100 list. If a search query contains a spelling mistake, typographical error, or a terminology difference, the results returned may be entirely irrelevant to the user. With conventional search engine systems, the user must figure out how to change the query in order to improve results. A higher quality search (i.e., one that is not too broad or too narrow, does not contain a spelling mistake, or the like) would help the user locate desired information more efficiently.
Some search engines display the relevancy metric, which savvy users may use to interpret the quality of their results and modify their queries accordingly. An example of a search engine that displays relevancy metrics associated with each result is AnswerWorks® by Vantage Software Technologies. However, whether from a lack of familiarity with the relevancy metric or from a failure to understand the significance of relevancy patterns in results, users may still not know how to improve their results or even that it would benefit them to try to do so.
Accordingly, systems and methods are needed to improve users' search experiences by helping them to more quickly locate desired results. There is a need for methods and systems that recognize certain indications of the nature or quality of the search query. There is also a need for systems and methods that aid users in improving their searches based on these determinations.