An effective framework involves distributed parallel computing, which disperses processing tasks across multiple processors operating on one or more computing devices such that the tasks may be executed in parallel. Prominent implementations of large-scale distributed parallel computing systems include MapReduce by Google®, Dryad by Microsoft®, and the open source Hadoop® MapReduce implementation. Google® is a registered trademark of Google Inc. Microsoft® is a registered trademark of the Microsoft Corporation in the United States, other countries, or both. Hadoop® is a registered trademark of the Apache Software Foundation.
Generally, MapReduce has emerged as a dominant paradigm for processing large datasets in parallel on compute clusters. Hadoop, as an open source implementation, has quickly become popular owing to its success in a variety of applications, such as social network mining, log processing, video and image analysis, search indexing, and recommendation systems. In many scenarios, long batch jobs and short interactive queries are submitted to the same MapReduce cluster, where they share limited common computing resources while having different performance goals. It has thus been recognized that, in order to meet these challenges, an efficient scheduler can be helpful, if not critical, in providing a desired quality of service for the MapReduce cluster.
Conventionally, efforts have been made to effect such service, but significant shortcomings have been noted. First, conventional schedulers usually guarantee fairness only for the map phase and not for reducers, in that reducers are launched greedily up to a maximum. Allocating excess computing resources in this manner, without balancing against map progress, can lead to underutilization. Second, most scheduling schemes directed to data locality consider only local inputs for map tasks and ignore the need for intermediate data generated by mappers to be fetched by reducers, through either networks or local disks. A consequence is that future run-time information is unavailable when reducers are scheduled. A greedy approach to launching reducers, as undertaken by conventional schedulers, can thus make wrong decisions at the outset, detached as it is from evolving job dynamics.
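By way of illustration only, the contrast between a greedy reducer launch and a launch that is balanced against map progress may be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce any particular scheduler: the function names, the slot model, and the proportional cap are all hypothetical and are not taken from the arrangements discussed above.

```python
def greedy_reducers_to_launch(total_reducers, running_reducers, free_slots):
    """Greedy policy: fill every free slot with a not-yet-launched reducer."""
    remaining = total_reducers - running_reducers
    return max(0, min(remaining, free_slots))


def balanced_reducers_to_launch(total_reducers, running_reducers, free_slots,
                                maps_done, maps_total):
    """Hypothetical balanced policy (an assumption for illustration).

    The number of launched reducers is capped at the fraction of map tasks
    that have completed, so reducers do not hold slots while little
    intermediate data exists for them to fetch.
    """
    if maps_total <= 0:
        return 0
    # Integer ceiling of (maps_done / maps_total) * total_reducers.
    allowed = (maps_done * total_reducers + maps_total - 1) // maps_total
    return max(0, min(allowed - running_reducers, free_slots))
```

In this sketch, a job with 10 reducers and 8 free slots would have 8 reducers launched at once under the greedy policy, whereas the balanced policy would launch only 2 while 2 of 10 map tasks are complete, leaving the other slots available for map tasks.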
In conventional arrangements where a type of delayed scheduling is used to improve data locality for map tasks, the introduced delays degrade performance in heterogeneous environments, e.g., when the input data are not distributed evenly over a large fraction of nodes in the computing cluster. This can cause underutilization and instability, in that the number of mappers running simultaneously will not reach a desired level and will vary greatly over time.
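The effect of such delays may be illustrated with a simplified sketch of one form of delayed scheduling. The function `pick_map_task`, its parameters, and the skip-count threshold are hypothetical names introduced here for illustration; the sketch assumes that a job declines a non-local slot until it has been skipped `max_skips` times, which is only one possible form of delayed scheduling.

```python
def pick_map_task(node, pending_tasks, skip_count, max_skips):
    """Choose a map task for a free slot on `node`.

    pending_tasks: list of (task_id, hosts), where hosts is the set of
                   nodes that hold the task's input data.
    skip_count:    number of times the job has already declined a slot.
    Returns (task_id or None, updated skip_count).
    """
    # Prefer a task whose input data is local to the offering node.
    for task_id, hosts in pending_tasks:
        if node in hosts:
            return task_id, 0
    # No local task: accept a non-local one only after enough skips.
    if pending_tasks and skip_count >= max_skips:
        return pending_tasks[0][0], 0
    # Otherwise decline the slot, which leaves it idle for this round.
    return None, skip_count + 1
```

When the input data of all pending tasks resides on a single node, every slot offered by another node is declined until the threshold is reached. In this sketch, that idle period corresponds to the underutilization described above, and the burst of non-local launches that follows corresponds to the variation in the number of simultaneously running mappers.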