MapReduce is an important programming paradigm for processing large data sets. Accordingly, considerable work has gone into the design of MapReduce schedulers. Existing scheduling approaches have focused on collections of singleton MapReduce jobs, as single MapReduce jobs were originally the natural atomic unit of work. More recently, however, more elaborate MapReduce work has emerged, and it is now common to see the submission of flows of interconnected MapReduce jobs. Each flow can be represented by a directed acyclic graph in which the nodes are singleton MapReduce jobs and the directed arcs represent precedence constraints. In effect, the atoms have become molecules: the flows have become the basic unit of MapReduce work, and it is the completion time of the flows, not of the individual MapReduce jobs, that commonly determines the appropriate measure of quality.
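The flow structure described above can be sketched as follows. This is a minimal illustration with hypothetical job names: the flow is a directed acyclic graph mapping each job to its prerequisite jobs, and a topological sort yields an execution order that respects every precedence arc.

```python
# Hypothetical flow: nodes are singleton MapReduce jobs, and each job maps
# to the list of jobs that must complete before it may start.
flow = {
    "extract":   [],            # no prerequisites
    "join":      ["extract"],   # may start only after "extract" completes
    "aggregate": ["join"],
    "report":    ["join"],
}

def topological_order(dag):
    """Return an execution order respecting all precedence arcs (Kahn's algorithm)."""
    remaining = {job: set(preds) for job, preds in dag.items()}
    order = []
    while remaining:
        # Jobs whose prerequisites have all been scheduled are ready to run.
        ready = [j for j, preds in remaining.items() if not preds]
        if not ready:
            raise ValueError("cycle detected: not a valid flow")
        for j in ready:
            order.append(j)
            del remaining[j]
        for preds in remaining.values():
            preds.difference_update(ready)
    return order
```

Any schedule for the flow must dispatch jobs in some order consistent with this topological ordering.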
Previous parallel scheduling implementations and theoretical results include what are referred to as rigid jobs. A rigid job runs on a fixed number of processors (also referred to herein as slots), all of which are presumed to start and complete the job's work simultaneously. One can thus think of a job as a rectangle whose width corresponds to the number of processors p, whose height corresponds to the execution time t of the job, and whose area, s=p·t, corresponds to the work performed by the job.
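The rectangle model above can be captured in a few lines; this is an illustrative sketch, not an implementation from any particular scheduler:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RigidJob:
    processors: int   # p: width of the rectangle (fixed number of slots)
    exec_time: float  # t: height of the rectangle

    @property
    def work(self) -> float:
        # s = p * t: area of the rectangle, the total work performed
        return self.processors * self.exec_time
```

For example, a job occupying 4 slots for 2.5 time units performs 10 units of work.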
Early work focused on the makespan metric, while subsequent parallel scheduling research introduced additional considerations. One such consideration is moldable scheduling, wherein each job can be run on an arbitrary number of processors, but with an execution time that is a monotone non-increasing function of that number; the width of a job thus changes from an input parameter to a decision variable. Another consideration, malleable scheduling, allows the number of processors allocated to a job to vary over time. In either case, each job must still perform its fixed amount of work.
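The moldable case can be illustrated with a simple speedup model. The function t(p) = s / min(p, p_max), modeling perfect speedup up to a parallelism bound, is an assumption chosen for illustration; the only property the moldable framework requires is that t(p) be monotone non-increasing in p.

```python
def exec_time(work: float, procs: int, max_parallelism: int) -> float:
    # Illustrative monotone non-increasing time function: perfect speedup
    # up to max_parallelism, no further benefit beyond it.
    return work / min(procs, max_parallelism)

def best_width(work: float, max_parallelism: int, available_procs: int) -> int:
    # The width p is a decision variable: among all feasible widths, pick
    # the narrowest one achieving the minimum execution time, so no slots
    # are wasted on processors that contribute no speedup.
    times = {p: exec_time(work, p, max_parallelism)
             for p in range(1, available_procs + 1)}
    t_min = min(times.values())
    return min(p for p, t in times.items() if t == t_min)
```

Under this model, allocating more than max_parallelism processors leaves the execution time unchanged, so the chosen width never exceeds that bound.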
Accordingly, existing approaches for scheduling simultaneous MapReduce work on a distributed cluster of processors are typically slot-based. Some such scheduling techniques favor time of arrival, while others favor notions of fairness.
However, MapReduce work is typically initiated in the form of flow-graph applications rather than single MapReduce jobs. These flow-graphs commonly include nodes describing MapReduce jobs, with directed arcs corresponding to precedence relations between the jobs. Additionally, it is often the completion time of the entire flow-graph, as noted above, that is of importance to the user submitting the application. Moreover, the completion times of the individual MapReduce jobs themselves are often of limited relevance, because each such job is commonly one step on a path to a larger goal. Accordingly, a need exists for scheduling management of overall MapReduce flow-graph applications, with the additional goal of optimizing metrics based on completion times of such applications.
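The metric of interest can be made concrete with a short sketch (hypothetical job names and times): the completion time of a flow is when its last job finishes, and even with unlimited slots it is bounded below by the longest precedence-constrained chain, independent of any single job's completion time.

```python
def flow_completion_lower_bound(dag, exec_times):
    """Critical-path length of the flow: the earliest possible finish time
    assuming unlimited processor slots. dag maps each job to its list of
    prerequisite jobs; exec_times maps each job to its execution time."""
    finish = {}

    def finish_time(job):
        if job not in finish:
            # A job starts once all its predecessors have finished.
            start = max((finish_time(p) for p in dag[job]), default=0.0)
            finish[job] = start + exec_times[job]
        return finish[job]

    return max(finish_time(j) for j in dag)
```

For instance, in a diamond-shaped flow where job "d" awaits both "b" and "c", which in turn await "a", the flow's completion is governed by the longer of the two middle branches, however quickly the shorter branch finishes.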