Computer networks have become increasingly widespread, and their scale continues to expand. A present-day computer network may contain tens of thousands of servers and storage devices. Most computers in such a network do not individually have large storage capacity or bandwidth, but they have the advantage of being large in number.
In order to execute large-scale tasks in such a computer network, such as computation on large data sets, a distributed parallel computing framework has been proposed. A computing system using such a parallel computing framework typically comprises a master node, one or more computing nodes, and one or more sink nodes. The master node partitions a large-scale data computing task into multiple small-scale sub-tasks, determines the number and locations of the computing nodes and sink nodes that are to execute the task, and designates, for each sub-task, the computing nodes and sink nodes that are to complete it. The computing nodes receive and execute the sub-tasks designated to them and notify the master node when execution of the sub-tasks is complete. Based on instructions from the master node, the computing nodes transfer intermediate data to the sink nodes that the master node has designated for them. The sink nodes combine the intermediate results received from the corresponding computing nodes so as to obtain a computation result for the present task. The computing nodes and the sink nodes are logical nodes and can be located on the same or different physical network nodes.
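The division of roles described above can be illustrated with a minimal sketch. It is not taken from any particular framework: the function names (partition, compute, sink, run_task), the sum-of-squares workload, and the use of a local thread pool in place of networked computing nodes are all assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, num_subtasks):
    """Master node role: split a large data set into small sub-tasks."""
    if not data:
        return []
    size = -(-len(data) // num_subtasks)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def compute(subtask):
    """Computing node role: produce an intermediate result for one sub-task.

    The workload (a sum of squares) is a hypothetical stand-in.
    """
    return sum(x * x for x in subtask)


def sink(intermediate_results):
    """Sink node role: combine intermediate results into the final result."""
    return sum(intermediate_results)


def run_task(data, num_computing_nodes=4):
    """Master node role: partition, dispatch, and hand results to the sink.

    A thread pool stands in for computing nodes that would normally sit on
    different physical network nodes.
    """
    subtasks = partition(data, num_computing_nodes)
    with ThreadPoolExecutor(max_workers=num_computing_nodes) as pool:
        intermediate = list(pool.map(compute, subtasks))
    return sink(intermediate)


if __name__ == "__main__":
    print(run_task(list(range(10)), num_computing_nodes=3))
```

In this sketch the intermediate results travel through local memory. In a real deployment that hand-off between computing nodes and sink nodes is a network transfer, which is the step the remainder of this section is concerned with.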
A typical example of such a parallel computing framework is MapReduce, a software framework released by Google® in 2004 to support distributed computing on large data sets (typically larger than 1 TB) on clusters of computers (related documents can be downloaded from http://hadoop.apache.org/mapreduce/). More and more applications are adopting MapReduce. The MapReduce framework likewise comprises a Master (i.e., the master node), one or more Mappers (i.e., mapping nodes) serving as computing nodes, and one or more Reducers (i.e., reduction nodes) serving as sink nodes.
Throughput is a key factor in MapReduce runtime. Owing to the nature of the MapReduce framework, a large amount of data must be transferred within it; for example, the output of the Mappers must be transferred to the specified Reducers in the shuffle phase. Generally speaking, data transfer patterns vary from application to application. Even for the same application, however, different task scheduling may lead to different data transfer patterns in the network.
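The dependence of the transfer pattern on task scheduling can be made concrete with a small sketch of the shuffle phase. The word-count workload, the host names, the function names, and the CRC32-based partitioning rule are assumptions chosen for illustration; they are not taken from the MapReduce implementation itself.

```python
import zlib
from collections import Counter, defaultdict


def mapper(text):
    """Mapper: emit a (word, 1) pair for every word in the input split."""
    return [(word, 1) for word in text.split()]


def reducer_for(key, num_reducers):
    """Assign a key to a Reducer index.

    CRC32 is used because it is deterministic across runs; this partitioning
    rule is an assumption of the sketch.
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle_traffic(splits, mapper_hosts, reducer_hosts):
    """Count the records sent from each Mapper host to each Reducer host."""
    traffic = Counter()
    reducer_inputs = defaultdict(list)
    for split, src in zip(splits, mapper_hosts):
        for key, value in mapper(split):
            r = reducer_for(key, len(reducer_hosts))
            reducer_inputs[r].append((key, value))
            traffic[(src, reducer_hosts[r])] += 1
    return traffic, reducer_inputs


def reduce_counts(pairs):
    """Reducer: sum the values received for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


if __name__ == "__main__":
    splits = ["a b a", "b c"]
    # Same application and input, two hypothetical schedules of the Reducers.
    for reducer_hosts in (["h1", "h2"], ["h3", "h3"]):
        traffic, _ = shuffle_traffic(splits, ["h1", "h2"], reducer_hosts)
        print(reducer_hosts, dict(traffic))
```

Both schedules move the same five records and yield the same word counts, but the host-to-host traffic differs: with both Reducers placed on h3, every record crosses the network to a single host, whereas placing the Reducers on h1 and h2 spreads the records across those two hosts.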
As a result, for many applications, the transfer of intermediate results across the network becomes a throughput bottleneck. One main reason is that the applications have no knowledge of network conditions and cannot control network behavior, and thus rely entirely on the network itself to complete data transfer. Meanwhile, the network has no understanding of the data transfer patterns of the applications. Since the network always adopts a fixed data transfer pattern while the data transfer patterns of the applications may change greatly, bandwidth is wasted and congestion arises during network transfer.
FIG. 1A shows one problem of the existing MapReduce: Reducers at appropriate locations cannot be selected because network information is lacking in the parallel computing environment. In the MapReduce framework, since the master node at the application layer has no knowledge of the relative positions of the Mappers and Reducers, it cannot select for each Mapper the Reducers closest to it. FIG. 1B shows another problem of the existing MapReduce: bandwidth is wasted because of application-layer multicast in the parallel computing environment. A system using the MapReduce framework normally has a multicast requirement, that is, the same data must be transferred from one host to a group of hosts. At present, this requirement is met by application-layer multicast. However, application-layer multicast is realized by unicasting the same data multiple times at the network layer. This wastes a significant amount of network bandwidth, especially when the amount of data is large. FIG. 1C shows a further problem of the existing MapReduce: network congestion arises because multi-path transfer is not well supported in the parallel computing environment. Since network data transfer from a given Mapper follows a fixed strategy, data from the same Mapper may all be transferred through the same path, resulting in congestion on that path.
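The first two problems can be quantified on a toy network. The sketch below is an illustration only: the topology (hosts h0 to h4 attached to switches s0 and s1), the use of hop count as the distance measure, and all function names are assumptions, and the multicast tree is simply approximated as the union of the shortest paths to the receivers.

```python
from collections import Counter, deque

# Hypothetical topology: h0 and h4 attach to switch s0; h1, h2 and h3 attach
# to switch s1; the two switches are connected by a single link.
GRAPH = {
    "h0": ["s0"], "h4": ["s0"],
    "s0": ["h0", "h4", "s1"],
    "s1": ["s0", "h1", "h2", "h3"],
    "h1": ["s1"], "h2": ["s1"], "h3": ["s1"],
}


def shortest_path(graph, src, dst):
    """Breadth-first search returning one minimum-hop path from src to dst."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    raise ValueError(f"no path from {src} to {dst}")


def links(path):
    """Undirected links traversed by a path."""
    return [frozenset(pair) for pair in zip(path, path[1:])]


def unicast_load(graph, src, dsts):
    """Application-layer multicast: one unicast copy per receiver."""
    load = Counter()
    for dst in dsts:
        load.update(links(shortest_path(graph, src, dst)))
    return load


def multicast_load(graph, src, dsts):
    """Network-layer multicast: each link of the tree carries one copy."""
    tree = set()
    for dst in dsts:
        tree.update(links(shortest_path(graph, src, dst)))
    return Counter({link: 1 for link in tree})


def nearest(graph, src, candidates):
    """Pick the candidate with the fewest hops from src."""
    return min(candidates, key=lambda c: len(shortest_path(graph, src, c)))


if __name__ == "__main__":
    receivers = ["h1", "h2", "h3"]
    print(sum(unicast_load(GRAPH, "h0", receivers).values()))    # 9
    print(sum(multicast_load(GRAPH, "h0", receivers).values()))  # 5
    print(nearest(GRAPH, "h0", ["h1", "h4"]))                    # h4
```

On this topology, sending the same data from h0 to three receivers by repeated unicast costs nine link traversals, with the s0-s1 link carrying three identical copies, whereas a multicast tree needs five. The nearest function shows the kind of selection a master node could make if it knew the topology: for a Mapper on h0, a Reducer on h4 is two hops away while one on h1 is three.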
Therefore, there is a need in the art for a technology by which the data transfer pattern in a parallel computing environment can vary according to the application.