An information technology (IT) system may include a framework for processing large volumes of data and may store that data across a number of computing nodes (“nodes”). The nodes may be logical or physical nodes. In an example, a node may receive a search query from a user to retrieve certain data stored in the nodes. A processing system can process the search query by searching the nodes simultaneously for search results. The processing system can then collect and combine the search results to produce final search results. “Combining” search results may refer to grouping, sorting, or transforming the search results, or to deriving data from the search results.
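By way of illustration only, the simultaneous search and subsequent combining described above can be sketched as follows. The sketch assumes in-memory stand-ins for the nodes; the names `NODES`, `search_node`, and `search_all` are hypothetical, and sorting is used as just one of the possible “combine” operations.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "nodes"; each holds a local slice of the data.
NODES = {
    "node-a": ["error: disk full", "info: started"],
    "node-b": ["error: timeout", "info: stopped"],
    "node-c": ["warn: retry", "error: disk full"],
}


def search_node(node_id, term):
    """Return (node_id, record) pairs for records on one node containing term."""
    return [(node_id, rec) for rec in NODES[node_id] if term in rec]


def search_all(term):
    """Search every node concurrently, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=len(NODES)) as pool:
        partials = list(pool.map(lambda node_id: search_node(node_id, term), NODES))
    # "Combine" here means flatten and sort (by record, then node);
    # grouping, transforming, or deriving data are alternatives.
    combined = [hit for part in partials for hit in part]
    return sorted(combined, key=lambda hit: (hit[1], hit[0]))
```

In this sketch, `search_all("error")` would query all three stand-in nodes in parallel and return a single sorted list of matching records tagged with the node each came from.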
For example, common “map-reduce” techniques include a map operation that obtains data from the nodes and a separate reduce operation that combines the retrieved data to generate final results. In such systems, fewer nodes typically perform the reduce operations than perform the map operations. For example, under the direction of a single node, other nodes can perform map operations on locally stored data and write their outputs to temporary storage. The outputs may be distributed among the nodes and subsequently received by the single node. Then, in the reduce operation, the single node can combine the outputs to generate final search results.
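The map and reduce operations described above can be sketched, under simplifying assumptions, as follows. The sketch is not any particular framework's API: `LOCAL_DATA`, `map_on_node`, `reduce_on_single_node`, and the dictionary standing in for temporary storage are all hypothetical, and counting event types is chosen only as an example of a combining operation.

```python
from collections import Counter

# Hypothetical locally stored data, one list per mapper node.
LOCAL_DATA = {
    "node-a": ["error", "info", "error"],
    "node-b": ["warn", "error"],
    "node-c": ["info"],
}


def map_on_node(node_id):
    """Map step: count event types in one node's local data."""
    return Counter(LOCAL_DATA[node_id])


def reduce_on_single_node(map_outputs):
    """Reduce step: a single node merges the output of every mapper."""
    total = Counter()
    for partial in map_outputs:
        total.update(partial)  # adds counts key by key
    return dict(total)


# Stand-in for temporary storage: one map output per node.
temp_storage = {node_id: map_on_node(node_id) for node_id in LOCAL_DATA}

# The single reducing node receives all map outputs and combines them.
final_results = reduce_on_single_node(temp_storage.values())
```

In this sketch the single reducing node consumes one input per mapping node, so the work it performs grows with the number of nodes that are searched.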
These existing techniques have limited efficiency and scalability because the relatively few nodes that perform the combining operations become a bottleneck and can rapidly reach and exceed their capacity as the number of nodes being searched increases. As a result, attempts to obtain search results from large volumes of data stored across numerous nodes can fail.