Searching networks and file systems for content has been provided in many forms, most commonly by a variant of a search engine. A search engine is a program that searches documents on a network for specified keywords and returns a list of the documents in which the keywords were found. Often, the documents on the network are first identified by “crawling” the network.
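The keyword lookup described above can be illustrated with a minimal sketch. It assumes an in-memory inverted index and assumes that a document matches only when it contains every keyword; the names `build_index` and `search` are hypothetical and are not taken from any particular search engine.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document names that contain it.

    `documents` is a hypothetical mapping of document name -> document text.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in text.lower().split():
            index[word].add(name)
    return index


def search(index, keywords):
    """Return a sorted list of documents that contain every keyword (assumed AND semantics)."""
    matches = [index.get(keyword.lower(), set()) for keyword in keywords]
    if not matches:
        return []
    return sorted(set.intersection(*matches))


if __name__ == "__main__":
    docs = {
        "a.txt": "network search engine",
        "b.txt": "file system search",
    }
    idx = build_index(docs)
    print(search(idx, ["search"]))
    print(search(idx, ["search", "network"]))
```

A production engine would typically add tokenization, ranking, and persistent storage; the sketch shows only the mapping from keywords to a list of documents.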
To retrieve documents in a crawl, an operation is executed for each document on the network to get the document and populate an index with a record for that document. Security vulnerabilities exist in such a search system. Documents coming from the Internet often should not be trusted, as they may be malicious or specially crafted to exploit one of the vulnerabilities. Certain parts of the search and indexing process may have security flaws that expose risks ranging from disclosure of private information to complete takeover of a user's machine.
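The per-document crawl operation can be sketched as follows. The sketch assumes a small in-memory stand-in for a network (`NETWORK`), a hypothetical `fetch` function in place of a real retrieval operation, and a simple record format; none of these come from a specific system.

```python
from collections import deque

# Hypothetical in-memory "network": address -> (document text, outgoing links).
NETWORK = {
    "doc://home": ("welcome to the search demo", ["doc://a", "doc://b"]),
    "doc://a": ("crawling a network", ["doc://b"]),
    "doc://b": ("indexing documents", ["doc://home", "doc://missing"]),
}


def fetch(address):
    """Stand-in for the per-document 'get' operation; raises KeyError if the document is absent."""
    return NETWORK[address]


def crawl(start):
    """Visit each reachable document once and return one index record per document."""
    records = {}
    queue = deque([start])
    seen = {start}
    while queue:
        address = queue.popleft()
        try:
            text, links = fetch(address)
        except KeyError:
            continue  # document could not be retrieved; skip it
        # Populate the index with a record for this document.
        records[address] = {"words": sorted(set(text.lower().split()))}
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return records


if __name__ == "__main__":
    for address, record in sorted(crawl("doc://home").items()):
        print(address, record["words"])
```

The sketch handles fetched text as plain data and performs no validation of it, so it illustrates only the crawl-and-index flow, not a defense against untrusted documents.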