1. Field of the Invention
The present invention relates to distributed computing and more particularly to scheduling of tasks comprising a job in different nodes of a distributed computing system.
2. Description of the Related Art
Distributed computing refers to the distribution of computing tasks to one or more selected nodes amongst several nodes in a computing cluster within a network environment. Early forms of the distributed computing model were embodied by a cluster of nodes, generally complete servers or server images, wholly within the control of a central operator. With the advent of Web services and grid based computing, modern forms of distributed computing are embodied by the cloud computing model. Cloud computing generally refers to a collection of nodes disposed upon the global Internet with disparate controlling entities providing processing on demand as a service to different requestors, usually for a fee. Irrespective of the nature of a distributed computing system, scheduling of jobs to different nodes remains paramount to the operation of a distributed computing system.
Task scheduling relates to the determination of one or more nodes in a cluster both able and also desired to process a selected task or tasks in a manner most likely to meet the processing objectives of the distributed computing system. By completing the tasks, the distributed computing infrastructure completes the job comprised of the tasks. In the most basic circumstance, the least taxed node in a cluster is assigned a pending task so as to perform the task in a most expedient manner. In a more sophisticated circumstance, the resources of available nodes can be matched to the task at hand to ensure efficient processing of a job, regardless of whether efficient processing refers to the fastest time to process the job, the most energy-efficient manner of processing the task, or the lowest cost to process the task. In an even more sophisticated circumstance, the assignment of tasks to nodes in a cluster can be balanced to ensure that no one node receives a burdensome portion of the load. Finally, the terms of a service level agreement (SLA) for a particular customer's job that are associated with a task to be scheduled can be taken into account when selecting available nodes to process the task.
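By way of illustration only, the most basic circumstance described above, in which a pending task is assigned to the least taxed node, can be sketched as follows. The function name assign_tasks, the node names, and the use of a single numeric cost per task as the measure of load are hypothetical and are assumed solely for purposes of the example; no particular scheduler is implied.

```python
import heapq


def assign_tasks(task_costs, node_names):
    """Greedily assign each task to the currently least-loaded node.

    task_costs maps a task id to an assumed numeric cost estimate.
    """
    # Min-heap of (accumulated_load, node_name); the least taxed node pops first.
    heap = [(0, name) for name in node_names]
    heapq.heapify(heap)
    assignment = {name: [] for name in node_names}
    for task_id, cost in task_costs.items():
        load, name = heapq.heappop(heap)
        assignment[name].append(task_id)
        # The node's load grows by the cost of the task it just received.
        heapq.heappush(heap, (load + cost, name))
    return assignment


if __name__ == "__main__":
    print(assign_tasks({"t1": 5, "t2": 3, "t3": 2, "t4": 4}, ["node-a", "node-b"]))
```

In this sketch each task is considered in isolation; matching of node resources, load balancing constraints, and SLA terms would each require additional inputs to the selection step.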
While the foregoing assumes discrete tasks scheduled for processing in different nodes of a computing cluster, not all computing can be related to the processing of discrete tasks. Rather, in some circumstances, a collection of tasks that individually lack stand-alone meaning can combine as part of a larger, meaningful job. A prime example includes tasks resulting from parallel computing, in which a problem is subdivided into multiple tasks that are assigned to different nodes for processing in parallel. The result sets from the processed tasks are subsequently combined to produce a result for the job. Map/Reduce computing is a technology outgrowth of parallel computing as it relates to computation processing.
As is well known, Map/Reduce has two main components: a “Map” step and a “Reduce” step. In the “Map” step, the master node accepts input, divides the input into smaller sub-problems (tasks), and distributes those smaller sub-problems to correspondingly different worker nodes. (A worker node may do this again in turn, leading to a multi-level tree structure.) Each worker node in turn processes its smaller problem and passes the answer back to its master node. Thereafter, in the “Reduce” step, the master node takes the answers to all the sub-problems and combines them to produce the output, namely the answer to the problem it was originally trying to solve.
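By way of illustration only, the “Map” and “Reduce” steps described above can be sketched as a word count performed in a single process. The names map_step, reduce_step and run_job are hypothetical, and the loop over chunks merely stands in for distribution to separate worker nodes; the sketch does not reflect any particular Map/Reduce implementation.

```python
from collections import defaultdict


def map_step(chunk):
    """Worker side: emit a (word, 1) pair for every word in the chunk."""
    return [(word, 1) for word in chunk.split()]


def reduce_step(pairs):
    """Master side: combine the workers' answers into one total per word."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(text, num_workers):
    """Split the input into sub-problems, map each one, then reduce."""
    lines = text.splitlines()
    # One chunk per (simulated) worker node, dealt out round-robin by line.
    chunks = [" ".join(lines[i::num_workers]) for i in range(num_workers)]
    intermediate = []
    for chunk in chunks:  # each iteration stands in for one worker node
        intermediate.extend(map_step(chunk))
    return reduce_step(intermediate)


if __name__ == "__main__":
    print(run_job("a b a\nb c\na", num_workers=2))
```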
One advantage of Map/Reduce is that it allows for distributed processing of the map and reduction operations. Provided each mapping operation is independent of the others, all maps can be performed in parallel, though in practice the degree of parallelism is limited by either or both of the data source and the number of central processing units (CPUs) near that data. Similarly, a set of ‘reducers’ can perform the reduction phase; all that is required is that all outputs of the map operation that share the same key are presented to the same reducer at the same time. While this process can often appear inefficient compared to algorithms that are more sequential, Map/Reduce can be applied to significantly larger datasets than that which “commodity” servers can handle: a large server farm can use Map/Reduce to sort a petabyte of data in only a few hours. The parallelism also offers some possibility of recovering from partial failure of servers or storage during the operation: if one mapper or reducer fails, the work can be rescheduled, assuming the input data are still available.
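By way of illustration only, the requirement that all map outputs sharing the same key be presented to the same reducer can be sketched with a deterministic hash partition. The names partition, shuffle and reduce_bucket are hypothetical, and the use of a CRC-32 checksum modulo the number of reducers is an assumption made for the example rather than a description of any particular system.

```python
import zlib
from collections import defaultdict


def partition(key, num_reducers):
    """Pick a reducer index for a key; the same key always maps to the same index."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(mapped_pairs, num_reducers):
    """Group map outputs by key inside the bucket of the reducer that owns the key."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in mapped_pairs:
        buckets[partition(key, num_reducers)][key].append(value)
    return buckets


def reduce_bucket(bucket):
    """One reducer: sum the values collected for each of its keys."""
    return {key: sum(values) for key, values in bucket.items()}


if __name__ == "__main__":
    pairs = [("a", 1), ("b", 1), ("a", 1), ("c", 1), ("b", 1)]
    for index, bucket in enumerate(shuffle(pairs, num_reducers=3)):
        print(index, reduce_bucket(bucket))
```

Because each bucket is self-contained, a failed reducer in this sketch could be re-run from its bucket alone, which loosely mirrors the rescheduling property noted above.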
Traditional modes of task scheduling in a distributed computing system, as in the case of cloud computing, can fail in the face of parallel programming methodologies like Map/Reduce. In this regard, traditional modes of task scheduling in distributed computing do not account for the interrelationship and interdependency of different tasks of a computational problem processed in parallel across different jobs. Rather, all decision making in respect to task scheduling is performed only with a view to the nature of an individual, atomic job to be assigned to a particular node. Consequently, an undesirable result can occur in which an individual job estimated to require a duration of processing in a node that exceeds the duration of processing required for a different job will not be prioritized over the different job, even though completing the remaining tasks for the individual job would complete that job before the different job can complete.
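By way of illustration only, and under one possible reading of the foregoing, the undesirable result can be sketched by comparing two orderings of the same pending tasks on a single node. The job labels, durations, and the two ordering rules shown are hypothetical and are assumed solely for the example; neither rule is presented as the scheduling method of any particular system.

```python
def completion_times(order):
    """Run (job, duration) tasks back to back on one node; return each job's finish time."""
    clock, finished = 0, {}
    for job, duration in order:
        clock += duration
        finished[job] = clock  # overwritten until the job's last task has run
    return finished


def task_level_order(pending):
    """Shortest individual task first, ignoring which job a task belongs to."""
    return sorted(pending, key=lambda task: task[1])


def job_aware_order(pending):
    """Tasks of the job with the least total remaining work first."""
    remaining = {}
    for job, duration in pending:
        remaining[job] = remaining.get(job, 0) + duration
    return sorted(pending, key=lambda task: remaining[task[0]])


if __name__ == "__main__":
    # Job A has one long task left; job B has three shorter tasks left.
    pending = [("A", 10), ("B", 6), ("B", 6), ("B", 6)]
    print(completion_times(task_level_order(pending)))
    print(completion_times(job_aware_order(pending)))
```

In this sketch the task-level ordering defers job A behind every task of job B because the single task of job A is individually longer, whereas the job-aware ordering lets job A complete first because less total work remains for it.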