Content such as text, pictures, audio, video, or other data can be stored on multiple storage systems connected to a communication network (e.g., the Internet). Such content is typically represented logically as one or more electronic files. A search engine may be used to identify files that satisfy a search query submitted to access particular content. For a search engine to operate efficiently, the content needs to be indexed in advance. A software mechanism (e.g., a crawler) can be used to crawl the files or sites on the network for content. The term "crawling" refers to the process of collecting the content of multiple files for indexing so that the content can be searched.
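The relationship between crawling, indexing, and searching can be illustrated with a minimal sketch. It assumes a small in-memory dictionary in place of networked storage systems; the names FILES, crawl, build_index, and search are hypothetical and chosen only for illustration. The index used here is a simple inverted index (word to file paths), which is one common choice and not necessarily the one any particular search engine uses.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for files held on storage systems across a network.
FILES = {
    "site-a/report.txt": "Quarterly sales report for the video team",
    "site-b/notes.txt": "Meeting notes about audio and video formats",
    "site-c/readme.txt": "Text describing picture storage",
}


def crawl(files):
    """Collect the content of every file (the crawling step)."""
    for path, content in files.items():
        yield path, content


def build_index(crawled):
    """Build an inverted index: lowercase word -> set of file paths."""
    index = defaultdict(set)
    for path, content in crawled:
        for word in re.findall(r"[a-z0-9]+", content.lower()):
            index[word].add(path)
    return index


def search(index, query):
    """Return the sorted paths of files that contain every query word."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    # index.get avoids inserting empty entries for unknown words.
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)


index = build_index(crawl(FILES))
print(search(index, "video"))
```

Because the index is built ahead of time, a query only needs to look up its words in the index and does not have to reread every file, which is why indexing in advance lets the search engine operate efficiently.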