One problem in information retrieval is the lexical gap between query words and the words in documents to be retrieved in response to the query. Query expansion seeks to address this problem by expanding the original query in an attempt to produce a variant of the original query that will help the search engine to find more relevant documents.
For example, original queries have been expanded using terms related to the query terms. Lexical databases (e.g., the WordNet® database) have been used to find synonyms of query words, and those synonyms have been used to expand the original query. Expansion words may also be ones with high co-occurrence with the query terms, or frequent words from top-ranked retrieved documents. Additionally, some techniques have treated the original query and its alteration candidates as translation pairs, and have used statistical machine translation models to rank these candidates according to translation probabilities. For example, a word-based translation model has been used for ranking under two assumptions: that the alteration words are independent of each other, and that each alteration word is aligned to and generated from only one query word.
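The word-based ranking described above can be illustrated with a minimal sketch. It is not taken from any particular system. The function names, the toy probability table, the uniform choice of the aligned query word, and the probability floor are all assumptions made for illustration. Under the two independence assumptions, the score of a candidate is the product (here, the sum of logs) of per-word probabilities, where each alteration word is generated from exactly one query word.

```python
import math

# Hypothetical word translation table: P(alteration_word | query_word).
# The values are made up for illustration only.
TOY_TRANS_PROB = {
    ("car", "auto"): 0.4,
    ("car", "vehicle"): 0.3,
    ("cheap", "inexpensive"): 0.5,
}

def word_prob(alt_word, query_words, trans_prob, floor=1e-9):
    """P(alt_word | query): alt_word is generated from one query word,
    chosen uniformly (an assumption), so we average over query words."""
    if not query_words:
        return floor
    total = sum(trans_prob.get((q, alt_word), 0.0) for q in query_words)
    # Floor the probability so unseen words do not produce log(0).
    return max(total / len(query_words), floor)

def score_candidate(query_words, alteration_words, trans_prob):
    """Log-probability of a candidate, treating alteration words as
    independent of each other given the query."""
    return sum(
        math.log(word_prob(w, query_words, trans_prob))
        for w in alteration_words
    )

def rank_candidates(query_words, candidates, trans_prob):
    """Sort alteration candidates from most to least probable."""
    return sorted(
        candidates,
        key=lambda c: score_candidate(query_words, c, trans_prob),
        reverse=True,
    )

TOY_QUERY = ["cheap", "car"]
TOY_CANDIDATES = [["inexpensive", "auto"], ["vehicle"], ["banana"]]
TOY_RANKED = rank_candidates(TOY_QUERY, TOY_CANDIDATES, TOY_TRANS_PROB)
```

Because each alteration word is scored on its own, a model of this kind cannot capture dependencies between alteration words or alignments that span multiple query words. Longer candidates also accumulate more probability factors, so they tend to score lower than shorter ones.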