The present invention relates to cloud computing, and in particular to computing resource allocation in cloud computing.
In most modern enterprises, analyzing large amounts of data quickly and efficiently is important. One analysis tool is a map-reduce style of program, which includes a map-phase, a shuffle-phase and a reduce-phase. In one example, in the map-phase a primary node divides input data (i.e., a problem or workload) into subsets and distributes the subsets to processing nodes, wherein each processing node computes an intermediate output. In the reduce-phase, the processing nodes combine the intermediate outputs for all the subsets to form an output representing the result (i.e., the answer) for the input data. Between the map-phase and the reduce-phase, in the shuffle-phase the intermediate data are shuffled (i.e., sorted and exchanged between nodes) in order to move the data to the node that reduces them. The shuffle-phase generates traffic and consumes network bandwidth. A map-reduce implementation by Google is described in U.S. Pat. No. 7,650,331. The Apache Hadoop project provides a similar map-reduce implementation known as Hadoop.
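By way of illustration only, the three phases described above may be sketched as a single-process word count. The sketch is not taken from the cited implementations; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_map_reduce`), the word-count workload, and the CRC-based partitioning of keys across reducers are all assumptions chosen for brevity. In a real deployment the shuffle step would move data between nodes over the network, which is where the bandwidth consumption arises.

```python
import zlib
from collections import defaultdict


def map_phase(subset):
    """One processing node: emit an intermediate (word, 1) pair per word."""
    return [(word, 1) for line in subset for word in line.split()]


def shuffle_phase(intermediate_outputs, num_reducers):
    """Group pairs by key and route each key to the reducer that reduces it.

    Assumption: keys are assigned to reducers by a CRC32 hash modulo the
    number of reducers. In a cluster, this step is the network exchange.
    """
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for node_output in intermediate_outputs:
        for key, value in node_output:
            target = zlib.crc32(key.encode("utf-8")) % num_reducers
            partitions[target][key].append(value)
    return partitions


def reduce_phase(partition):
    """One reducer: combine all values collected for each of its keys."""
    return {key: sum(values) for key, values in partition.items()}


def run_map_reduce(lines, num_mappers=2, num_reducers=2):
    """Primary node: split the input, run the three phases, merge results."""
    # Divide the input data into one subset per (hypothetical) mapper node.
    subsets = [lines[i::num_mappers] for i in range(num_mappers)]
    intermediate = [map_phase(subset) for subset in subsets]
    partitions = shuffle_phase(intermediate, num_reducers)
    result = {}
    for partition in partitions:
        # Each key lives in exactly one partition, so updates never collide.
        result.update(reduce_phase(partition))
    return result


if __name__ == "__main__":
    print(run_map_reduce(["a b a", "b c"]))
```

In this sketch every intermediate pair passes through `shuffle_phase`, so the volume of shuffled data grows with the size of the map output rather than with the size of the final result.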
Map-reduce as a service in cloud computing provides a usage model for enterprises, allowing them to analyze large amounts of data without creating large infrastructures of their own. The cloud provider manages multiple map-reduce workloads executing concurrently. Network load is of special concern with map-reduce workloads because the map-reduce phases, the shuffle-phase in particular, can generate large amounts of traffic.