Information retrieval (IR) systems allow users to identify particular documents of interest from among a larger collection of documents. IR systems are useful for finding an article in a digital library, a news document in a broadcast repository, or a particular web site on the World Wide Web. To use such a system, the user specifies a query containing several words or phrases that describe areas of interest, and the system then retrieves documents that it determines may satisfy the query.
An IR system typically assigns each document a measure (e.g., a score) reflecting its likelihood of relevance to a query and ranks the documents by that measure. The ranking order is useful in determining whether one document is more relevant than another. Most applications, however, have the selection of relevant documents as their final goal. A ranking order by itself does not provide an indication of whether a document is actually relevant to the query. As a result, a large number of documents that are low in the ranking order are invariably returned for the query, even though these documents are probably not very relevant.
In order to decide which documents are relevant to the query, a threshold on the scores may be utilized. Documents with scores above the threshold are designated as relevant, and documents with scores below the threshold are designated as not relevant. Previous systems generally use an ad hoc approach to picking the threshold, such as looking at the top few documents in the ranking order and then setting an arbitrary score as the threshold.
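The thresholding decision described above can be illustrated with a minimal Python sketch. The function name `select_relevant`, the document identifiers, the score values, and the threshold of 0.5 are all hypothetical and chosen only for illustration; they are not taken from any particular IR system.

```python
from typing import Dict, List


def select_relevant(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return ids of documents scoring above the threshold, best first."""
    # Rank documents by descending score (the ranking order).
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Keep only documents whose score is above the threshold.
    return [doc_id for doc_id, score in ranked if score > threshold]


# Hypothetical scores returned for a single query.
scores = {"doc_a": 0.92, "doc_b": 0.61, "doc_c": 0.35, "doc_d": 0.08}

# An arbitrary, ad hoc threshold of the kind described above.
relevant = select_relevant(scores, threshold=0.5)
```

With these assumed values, only `doc_a` and `doc_b` are designated as relevant; the outcome depends entirely on where the arbitrary threshold is placed.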
This method of choosing thresholds, however, makes it difficult to arrive at a consistent decision threshold across queries, because the scores assigned to documents for one query generally do not relate to the scores assigned to documents for a different query. This results in a degradation of system performance on the document selection task. The alternative is to set the threshold separately for each query, but this is impracticable. Accordingly, there is presently a need for a system that normalizes scores so that a decision threshold is consistent across different queries.
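The inconsistency across queries can be seen in a small Python sketch. The two score sets, the helper `count_above`, and the fixed threshold value are hypothetical; they assume only that raw scores for different queries may fall in unrelated ranges, as stated above.

```python
from typing import Dict


def count_above(scores: Dict[str, float], threshold: float) -> int:
    """Count documents whose score is above the threshold."""
    return sum(1 for score in scores.values() if score > threshold)


# Hypothetical raw scores for two different queries. The ranges differ
# because the scores for one query do not relate to those for another.
query_1_scores = {"d1": 12.4, "d2": 9.8, "d3": 3.1}
query_2_scores = {"d4": 0.9, "d5": 0.7, "d6": 0.2}

# A single threshold picked by inspecting the top documents of query 1.
FIXED_THRESHOLD = 5.0

selected_1 = count_above(query_1_scores, FIXED_THRESHOLD)
selected_2 = count_above(query_2_scores, FIXED_THRESHOLD)
```

Under these assumed values the threshold selects two documents for the first query and none for the second, even though the top documents of the second query may be just as relevant to it.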