The execution of workflow applications is a reality today in enterprise and scientific domains. To realize the potential of increased revenue through collective management of information technology (IT) resources, execution of these workflows on grid resources has assumed importance. The existing core middleware technologies for grids (for example, meta-schedulers) include sophisticated resource matching logic but lack control flow orchestration capability. Existing workflow orchestrators, on the other hand, suitably control enterprise logic but are unaware of the execution requirements of tasks. Combining scheduling technology with workflow management is therefore advantageous in the design of middleware for geographically distributed grids spanning organizational domains.
Existing endeavors concentrate on intra-domain workflow execution and use ad hoc, non-layered, non-standard solutions that adversely affect cross-organizational collaboration. In addition, existing approaches lack support for efficient data handling, which is especially crucial for the performance of data-intensive workflows in distributed data scenarios.
Also, existing approaches in workflow scheduling treat workflow orchestration and scheduling as separate activities, and handle only one workflow at a time. Additionally, in existing approaches, a scheduler computes mappings for each workflow without knowing the set of workflows to be executed, and the sharing of resources is not optimized between batches. Existing approaches also require non-trivial extension to accommodate multiple workflows, and their schedulers cannot control execution of a batch of workflows. Further, existing approaches include orchestrators that cannot honor the schedule ordering of jobs across workflows. Consequently, a scheduler and an orchestrator should advantageously be integrated to handle this; existing approaches, however, do not integrate such activities.
Existing approaches to workflow orchestration can include, for example, Taverna (a tool for the composition and enactment of bioinformatics workflows), WS-BPEL (business process execution language for web services), and YAWL (a workflow language). All such approaches are languages or tools for workflow modeling and orchestration that, however, do not consider how the workflows are mapped to lower-level resources.
Additionally, existing approaches to workflow scheduling can include, for example, scheduling data-intensive workflows onto storage-constrained distributed resources. Such approaches can also include, for example, Pegasus, which is a framework for mapping complex scientific workflows onto distributed systems. Pegasus, however, does not provide support for multiple workflows.
Other existing approaches can include, for example, cost-based scheduling of workflow applications on utility grids. However, these approaches do not address the problem of orchestrating and scheduling batch workflows on a shared set of resources.
Another existing approach includes, for example, Mounties, which is designed for managing applications and resources using rule-based constraints in cluster environments. However, Mounties does not work in the domain of grid jobs and data flows. Existing approaches additionally include, for example, event-based scheduling methods and systems for workflow activities. Such approaches, however, do not include integration with resource management or scheduling on available resources.
Existing approaches may not include, for example, a system in which multiple independent workflows are optimally scheduled, consideration of both jobs and data, run-time adaptations for multiple workflows, and/or a dynamic scheduling algorithm for more than one workflow. Furthermore, repeated scheduling using single-workflow algorithms provides sub-optimal results. Also, extending a single-workflow algorithm to multiple workflows is non-trivial and disadvantageous because the orchestrator does not know about resource selection across workflows, and the scheduler does not know about flow control of independent workflows.
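By way of illustration only, the following minimal Python sketch shows how repeated use of a single-workflow scheduler can yield a longer overall completion time than scheduling the same workflows together on a shared set of resources. The greedy list scheduler, the function names (upward_rank, list_schedule, one_at_a_time, jointly), the two toy workflows, and the task durations are all hypothetical assumptions introduced for this example; they are not taken from any of the existing approaches named above.

```python
from typing import Dict, List, Tuple

# Hypothetical toy model: a workflow maps each task name to
# (duration, list of predecessor task names). Durations must be positive.
Workflow = Dict[str, Tuple[float, List[str]]]


def upward_rank(wf: Workflow) -> Dict[str, float]:
    """Length of the longest path from each task to the end of its workflow."""
    succs: Dict[str, List[str]] = {t: [] for t in wf}
    for task, (_, preds) in wf.items():
        for p in preds:
            succs[p].append(task)
    rank: Dict[str, float] = {}

    def visit(task: str) -> float:
        if task not in rank:
            tail = max((visit(s) for s in succs[task]), default=0.0)
            rank[task] = wf[task][0] + tail
        return rank[task]

    for task in wf:
        visit(task)
    return rank


def list_schedule(wf: Workflow, machine_free: List[float]) -> Dict[str, float]:
    """Greedy list scheduling: longest remaining path first, earliest start wins.

    machine_free is updated in place, so later calls see resources already
    committed by earlier calls. Returns the finish time of each task.
    """
    rank = upward_rank(wf)
    finish: Dict[str, float] = {}
    # With positive durations a predecessor always outranks its successors,
    # so this order never schedules a task before its predecessors.
    for task in sorted(wf, key=lambda t: -rank[t]):
        duration, preds = wf[task]
        ready = max((finish[p] for p in preds), default=0.0)
        m = min(range(len(machine_free)),
                key=lambda i: max(machine_free[i], ready))
        start = max(machine_free[m], ready)
        finish[task] = start + duration
        machine_free[m] = finish[task]
    return finish


def one_at_a_time(workflows: List[Workflow], n_machines: int) -> float:
    """Schedule each workflow without knowledge of the ones that follow."""
    machine_free = [0.0] * n_machines
    for wf in workflows:
        list_schedule(wf, machine_free)
    return max(machine_free)


def jointly(workflows: List[Workflow], n_machines: int) -> float:
    """Schedule all workflows as one task set (task names assumed unique)."""
    merged: Workflow = {}
    for wf in workflows:
        merged.update(wf)
    machine_free = [0.0] * n_machines
    list_schedule(merged, machine_free)
    return max(machine_free)


# Workflow A: two independent tasks. Workflow B: a two-task chain.
wf_a: Workflow = {"a1": (4.0, []), "a2": (4.0, [])}
wf_b: Workflow = {"b1": (4.0, []), "b2": (4.0, ["b1"])}

print(one_at_a_time([wf_a, wf_b], 2))  # 12.0
print(jointly([wf_a, wf_b], 2))        # 8.0
```

In this toy case, scheduling workflow A in isolation commits both resources to its two independent tasks, so the chain in workflow B cannot start until time 4 and the batch completes at time 12. When the scheduler sees both workflows at once, it can start the chain immediately and the batch completes at time 8. The sketch covers only resource selection; it does not model data placement or the flow control performed by an orchestrator.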