Search engines are faced with dynamic variations in the query traffic generated by users. Many frequent queries to a search engine may be answered quickly by keeping the corresponding results stored in cache machines. Results not found in cache, however, are generally resolved directly, for example by a set of processors (nodes) forming a cluster. An aim is to determine the top-R results per query as quickly as possible and, from these results, build up the answer web pages presented to the users. For high query traffic, and given the huge volume of data associated with the web samples kept at each node, this can involve the use of a significant amount of resources (such as processor utilization, disk bandwidth, and network bandwidth). Current search engines deal with peaks in traffic by including hardware redundancy, typically enough that, at normal traffic, processor utilization is below 30% or 40%.
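By way of illustration only, the cache-then-cluster flow described above may be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular search engine: the names `top_r`, `cache`, and `nodes` are hypothetical, each node is modeled as a plain function returning its local top-R hits as (score, document id) pairs, and the cache is modeled as an in-memory dictionary.

```python
import heapq
from typing import Callable, Dict, List, Tuple

Hit = Tuple[float, str]                      # (score, document id)
NodeFn = Callable[[str, int], List[Hit]]     # hypothetical per-node search function


def top_r(query: str, r: int,
          cache: Dict[Tuple[str, int], List[Hit]],
          nodes: List[NodeFn]) -> List[Hit]:
    """Return the top-R hits for a query, consulting the cache first."""
    key = (query, r)
    if key in cache:
        # Frequent query: answered without touching the cluster.
        return cache[key]

    # Cache miss: every node resolves the query against its own data
    # and returns its local top-R hits.
    partials = [node(query, r) for node in nodes]

    # Merge the partial lists and keep the R highest-scoring hits overall.
    merged = heapq.nlargest(r, (hit for part in partials for hit in part))

    cache[key] = merged
    return merged
```

In this sketch a second call with the same query and the same R is served from `cache`, so the nodes are not contacted again.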
Conventionally, search engines use a standard multiple master/slave paradigm to process queries arriving from broker machines. Typically, each query is associated with a different master thread that is in charge of producing the query results and that can, in turn, contact slave threads located in other processors to get document information. This multi-threaded setting can result in significant performance degradation in situations of sustained high query traffic or even sudden peaks in traffic, which can lead to unstable behavior or increased query response times. The situation may be exacerbated when the on-line insertion of documents into the database and index is considered. In this scenario, dealing with a large number of threads can involve overhead from sources such as thread scheduling, costly thread-to-thread communication and synchronization, and concurrency control of readers and writers by means of locks.
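The conventional arrangement described above may be sketched, under simplifying assumptions, as one master thread per query that hands lookups to a pool of slave threads, with a lock guarding each node's documents against concurrent readers and writers. All names here (`Node`, `run_queries`, `slave_threads`) are hypothetical, and the nodes are modeled in a single process rather than on separate processors.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


class Node:
    """Hypothetical node holding a small document collection."""

    def __init__(self, docs: Dict[str, str]) -> None:
        self._docs = dict(docs)
        self._lock = threading.Lock()  # serializes readers and writers

    def lookup(self, term: str) -> List[str]:
        # Reader: holds the lock while scanning the documents.
        with self._lock:
            return sorted(d for d, text in self._docs.items()
                          if term in text.split())

    def insert(self, doc_id: str, text: str) -> None:
        # Writer: on-line insertion contends for the same lock.
        with self._lock:
            self._docs[doc_id] = text


def run_queries(queries: List[str], nodes: List[Node],
                slave_threads: int = 4) -> Dict[str, List[str]]:
    results: Dict[str, List[str]] = {}
    results_lock = threading.Lock()

    with ThreadPoolExecutor(max_workers=slave_threads) as slaves:

        def master(query: str) -> None:
            # The master contacts one slave task per node, then waits.
            futures = [slaves.submit(node.lookup, query) for node in nodes]
            docs = sorted(d for f in futures for d in f.result())
            with results_lock:
                results[query] = docs

        # One master thread per query: thread count grows with traffic.
        masters = [threading.Thread(target=master, args=(q,)) for q in queries]
        for t in masters:
            t.start()
        for t in masters:
            t.join()

    return results
```

Even in this small sketch, each additional query adds a thread to be scheduled, and every lookup and insertion contends for a node lock, which are the sources of overhead noted above.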