A typical search engine serves user queries by retrieving the most relevant documents containing the requested keywords from a vast set of possible candidates. The process of query resolution consists of two basic steps: first, determining the complete set of candidate documents that contain the keywords (also referred to as the filtered set of documents), and second, computing a ranking score for each of the documents in the filtered set, sorting the documents according to the ranking score, and retrieving the top N (typically 10-50) from the ranked list. The ranking score is determined by a ranking function, which is the core component of the search engine.
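The two steps can be illustrated with a minimal Python sketch. It assumes an in-memory inverted index that maps each term to a set of document identifiers, and it assumes that a candidate must contain every query term. The names `resolve_query`, `index`, and `toy_scores` are hypothetical and chosen only for illustration.

```python
import heapq

def resolve_query(query_terms, index, score, top_n=10):
    """Two-step query resolution: filter candidates, then rank them."""
    # Step 1: the filtered set holds documents containing every query term
    # (AND semantics are an assumption of this sketch).
    postings = [index.get(term, set()) for term in query_terms]
    candidates = set.intersection(*postings) if postings else set()
    # Step 2: score each candidate and keep the top_n highest-scoring ones.
    return heapq.nlargest(top_n, candidates,
                          key=lambda doc: score(doc, query_terms))

# Hypothetical toy data: term -> set of document ids, and a stand-in score.
index = {"ranking": {1, 2, 3}, "model": {2, 3, 4}}
toy_scores = {1: 5.0, 2: 1.0, 3: 4.0, 4: 9.0}

top = resolve_query(["ranking", "model"], index,
                    lambda doc, terms: toy_scores[doc], top_n=1)
```

Only documents 2 and 3 contain both terms, so document 4 is never scored even though its stand-in score is the highest.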
A ranking function takes multiple input values, also called features, that were extracted during the indexing process and maps all these features to a single numerical score. The features can be extracted from the document or the document metadata (e.g., term frequencies in the body of the document or in the metadata), or can be the result of a more complicated analysis of the entire corpus with respect to the particular document (e.g., document frequency of the terms, aggregated anchor text, page rank, click distance, etc.). Generally, the ranking function increases monotonically with the expected probability of the document being relevant to a particular query.
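As a rough illustration of the two kinds of features, the following Python sketch extracts a per-document feature (term frequency) and a corpus-level feature (document frequency). Whitespace tokenization, the toy corpus, and the function names are assumptions made for this example.

```python
from collections import Counter

def term_frequencies(text):
    """Per-document feature: how often each term occurs in the body."""
    return Counter(text.lower().split())

def document_frequencies(corpus):
    """Corpus-level feature: number of documents in which each term occurs."""
    df = Counter()
    for text in corpus.values():
        # Use a set so that each document counts a term at most once.
        df.update(set(text.lower().split()))
    return df

# Hypothetical two-document corpus.
corpus = {
    "d1": "ranking model ranking",
    "d2": "search engine ranking",
}
tf_d1 = term_frequencies(corpus["d1"])
df = document_frequencies(corpus)
```

Here `tf_d1` depends only on document `d1`, whereas `df` can be computed only by scanning the whole corpus.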
The ranking features can depend on the query (e.g., term frequency of the query term in the document), or be query-independent (e.g., page rank, in-degree, or document type). The query-dependent features are called dynamic, and are computed at query time. The query-independent features are called static, and can be pre-computed at index time. It is also possible to pre-compute the combination of all static features for a given ranking model to save computation costs.
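The split between index-time and query-time work might look like the sketch below. It assumes an additive ranking function, so that the static part can be collapsed into one number per document. The weights and feature names are hypothetical.

```python
# Hypothetical weights of an additive ranking model.
STATIC_WEIGHTS = {"page_rank": 2.0, "in_degree": 0.5}
TERM_FREQUENCY_WEIGHT = 1.0

def precompute_static_score(static_features):
    """Index time: combine all query-independent features once."""
    return sum(weight * static_features[name]
               for name, weight in STATIC_WEIGHTS.items())

def score_at_query_time(static_score, doc_term_counts, query_terms):
    """Query time: add only the query-dependent (dynamic) part."""
    dynamic = sum(doc_term_counts.get(term, 0) for term in query_terms)
    return static_score + TERM_FREQUENCY_WEIGHT * dynamic

# Stored alongside the document in the index.
static_score = precompute_static_score({"page_rank": 0.25, "in_degree": 4})
# Computed per query from the stored value plus the dynamic features.
total = score_at_query_time(static_score, {"ranking": 3}, ["ranking", "model"])
```

Because the stored value depends on the weights, it would have to be recomputed whenever the ranking model changes.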
The ranking function is usually not hardcoded, and is designed to have many parameters that can be configured depending on the desired result. The set of parameters is called the ranking model. The ranking model parameters are typically the weights used to combine the input features into the ranking score. The weights can be tuned to optimize the performance of the ranking function with respect to some relevance metric.
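A minimal sketch of this separation, assuming a simple weighted-sum ranking function: the function itself is fixed, and the ranking model is a plain mapping from feature name to weight. The feature names and weight values are hypothetical.

```python
def ranking_score(features, model):
    """Weighted sum of features; `model` maps feature name -> weight."""
    return sum(weight * features.get(name, 0.0)
               for name, weight in model.items())

# One document's features, scored under two hypothetical ranking models.
features = {"term_frequency": 3.0, "page_rank": 0.5}
default_model = {"term_frequency": 1.0, "page_rank": 2.0}
authority_model = {"term_frequency": 0.5, "page_rank": 4.0}

default_score = ranking_score(features, default_model)
authority_score = ranking_score(features, authority_model)
```

Swapping the model changes the score without any change to the ranking function's code.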
Typically, the tuning is done offline over a dataset that consists of hundreds to thousands of evaluation queries and a set of test documents that would be returned by the engine for these queries, with the corresponding ranking features extracted beforehand. An automatic tuner (e.g., a neural network) can be employed that performs a search over the vast parameter space to optimize the relevance metric over the evaluation set. The resulting ranking model is then shipped with the product. For a consistent ranking, every document returned for a given query has to be scored with the same ranking model; this does not mean, however, that the ranking model cannot change from query to query. Typically, though, the search engine has a single ranking model applied to all queries.
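The offline tuning loop can be sketched as follows. This example substitutes a plain random search for the neural-network tuner mentioned above, and it uses precision at rank 1 as a stand-in relevance metric. The evaluation data, function names, and search range are assumptions made for illustration.

```python
import random

def score(features, weights):
    """Weighted sum of a document's feature vector."""
    return sum(w * f for w, f in zip(weights, features))

def precision_at_1(weights, evaluation_set):
    """Fraction of queries whose top-scored document is labeled relevant."""
    hits = 0
    for docs in evaluation_set:  # docs: list of (features, is_relevant)
        best = max(docs, key=lambda doc: score(doc[0], weights))
        hits += best[1]
    return hits / len(evaluation_set)

def tune(evaluation_set, n_features, trials=200, seed=0):
    """Random search: keep the weight vector with the best metric value."""
    rng = random.Random(seed)
    best_weights = [1.0] * n_features
    best_metric = precision_at_1(best_weights, evaluation_set)
    for _ in range(trials):
        candidate = [rng.uniform(0.0, 1.0) for _ in range(n_features)]
        metric = precision_at_1(candidate, evaluation_set)
        if metric > best_metric:
            best_weights, best_metric = candidate, metric
    return best_weights, best_metric

# Hypothetical evaluation set: two queries, each with one relevant and one
# non-relevant document; features are (term_frequency, page_rank).
evaluation_set = [
    [((1.0, 0.0), True), ((0.0, 1.5), False)],
    [((2.0, 0.0), True), ((0.0, 1.0), False)],
]
baseline = precision_at_1([1.0, 1.0], evaluation_set)
weights, metric = tune(evaluation_set, n_features=2)
```

The tuned weights are only as good as the evaluation set: they are optimized for these queries and documents and carry no guarantee for a different corpus.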
This approach implies that the set of features and the model itself are substantially fixed once the product is released. Moreover, the approach assumes that the evaluation dataset is representative of every possible corpus in which the search engine can be used, which is clearly not true. In different environments, users may want to customize the ranking because of specialized domain knowledge that the user has but that was not considered when the evaluation dataset was built.