The use of search engines has become a part of everyday life. Users rely on search engines to find information electronically across various information sources. For example, the American legal system, as well as some other legal systems around the world, relies heavily on written judicial opinions, the written pronouncements of judges, to articulate or interpret the laws governing resolution of disputes. Each judicial opinion is important not only to resolving a particular legal dispute, but also to resolving similar disputes, or cases, in the future. Because of this, judges and lawyers within our legal system are continually searching an ever-expanding body of past opinions, or case law, for the ones most relevant to resolution of disputes.
To facilitate these searches, West Publishing Company of St. Paul, Minn. (doing business as Thomson West) collects judicial opinions from courts across the United States and makes them available electronically through its Westlaw® legal research system. Users access the judicial opinions, for example, by submitting keyword queries for execution by a search engine against a jurisdictional database of judicial opinions or case law.
Typically, search engines maintain, in one or more query logs, information concerning the queries a user has entered, the documents that were identified and viewed as a result of a search, the actions taken on those documents (such as viewing or printing), whether an advertisement or sponsored link provided with the search results was selected, and other information.
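As a rough illustration of the kind of record such a query log might hold, the following Python sketch defines one possible entry layout and a simple statistic computed from it. The class name, field names, document identifiers, and the view_rate helper are all hypothetical and are not taken from any particular search engine; the sketch only assumes that each entry records the query, the documents returned, and the actions taken.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QueryLogEntry:
    """One hypothetical query log record; every field name is illustrative."""
    user_id: str
    query: str
    returned_doc_ids: List[str]  # documents identified by the search
    viewed_doc_ids: List[str] = field(default_factory=list)
    printed_doc_ids: List[str] = field(default_factory=list)
    sponsored_link_clicked: bool = False


def view_rate(entries: List[QueryLogEntry], query: str, doc_id: str) -> Optional[float]:
    """Fraction of searches for `query` that returned `doc_id` and in which
    the user went on to view it. Returns None when the log holds no such
    search, which is the common case when log data is sparse."""
    shown = [e for e in entries
             if e.query == query and doc_id in e.returned_doc_ids]
    if not shown:
        return None
    viewed = sum(1 for e in shown if doc_id in e.viewed_doc_ids)
    return viewed / len(shown)


# A tiny made-up log: two users ran the same query, one ran a different one.
SAMPLE_LOG = [
    QueryLogEntry("u1", "adverse possession", ["d1", "d2"],
                  viewed_doc_ids=["d1"], printed_doc_ids=["d1"]),
    QueryLogEntry("u2", "adverse possession", ["d1", "d3"]),
    QueryLogEntry("u3", "easement by necessity", ["d4"],
                  viewed_doc_ids=["d4"], sponsored_link_clicked=True),
]

print(view_rate(SAMPLE_LOG, "adverse possession", "d1"))
print(view_rate(SAMPLE_LOG, "adverse possession", "d4"))
```

Returning None rather than zero for an unseen query and document pair keeps "never shown" distinct from "shown but never viewed", a distinction that matters when few entries exist for a given query.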
While information in query logs can be valuable in determining the relevance of search results to entered user queries, and therefore the effectiveness of a search engine in identifying relevant documents, current techniques for analyzing query log data do not overcome the inherent quality issues of this data, namely, that query log data tends to be noisy, sparse, incomplete, and volatile.
Accordingly, there is a need for improved information-retrieval systems, including document retrieval systems, that can effectively leverage query log data.