The amount of information and content available on the Internet continues to grow rapidly. Given the vast amount of information, search engines have been developed to facilitate searching for electronic documents. In particular, users may search for information and documents by entering search queries comprising one or more terms that may be of interest to the user. After receiving a search query from a user, a search engine identifies documents and/or web pages that are relevant to the search query. Because of its utility, web searching, that is, the process of finding relevant web pages and documents for user-issued search queries, has arguably become the most popular service on the Internet today.
Search engines operate by crawling documents and indexing information regarding the documents in a search index. Search indexes often comprise posting lists for the various terms found in the crawled documents. Each posting list identifies the documents in which a particular term was found. When a search query is received, the search engine employs the search index to identify documents relevant to the search query. Use of a search index in this manner allows for fast retrieval of information for queries. Without a search index, a search engine would need to scan the entire corpus of documents for each query to find relevant results, which would take an unacceptable amount of time.
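By way of illustration only, the relationship between terms, posting lists, and query evaluation described above may be sketched as follows. The function names `build_index` and `search`, the whitespace tokenization, and the sample documents are assumptions made for this example and are not drawn from any particular search engine.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to a sorted posting list of the document IDs containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumed tokenization: lowercase, split on whitespace.
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

def search(index, query):
    """Return IDs of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        # Intersect posting lists; the corpus itself is never rescanned.
        result &= set(index.get(term, []))
    return sorted(result)

# Hypothetical three-document corpus.
docs = {
    1: "fast search index",
    2: "search engine crawls documents",
    3: "posting lists index documents",
}
index = build_index(docs)
```

In this sketch, a query touches only the posting lists of its own terms, which is why an index may answer queries far faster than a scan over every document.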
When performing searches, search engines typically employ various mechanisms to provide fast lookup of search query terms in order to locate and retrieve posting lists. Tree-based structures and in-memory hash tables are examples of common mechanisms used for this purpose. However, these existing mechanisms do not scale efficiently to very large numbers of indexed objects, and their lookup performance may be insufficient.
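As a minimal sketch of the two kinds of lookup mechanism mentioned above, the following example locates a posting list either through an in-memory hash table or through an ordered structure. A sorted array searched with binary search stands in here for a tree-based structure, and the names `hash_lookup` and `ordered_lookup` and the sample data are hypothetical.

```python
from bisect import bisect_left

# Hypothetical term dictionary: term -> posting list of document IDs.
postings = {"engine": [2], "index": [1, 3], "search": [1, 2]}

def hash_lookup(term):
    """Hash-table lookup: expected constant time per term, table held in memory."""
    return postings.get(term, [])

# Ordered lookup: sorted terms with a parallel list of posting lists.
sorted_terms = sorted(postings)
sorted_lists = [postings[t] for t in sorted_terms]

def ordered_lookup(term):
    """Binary search over the sorted terms: O(log n) comparisons per term."""
    i = bisect_left(sorted_terms, term)
    if i < len(sorted_terms) and sorted_terms[i] == term:
        return sorted_lists[i]
    return []
```

Both functions return the same posting list for a given term; they differ in how the term is located, which is where lookup cost arises as the number of indexed terms grows.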