1. Technical Field
The present disclosure relates to managing a compute environment and, more specifically, to a system and method of modifying or updating a compute environment using system jobs. One embodiment of the disclosure relates to rolling maintenance on a node-by-node basis within the compute environment.
2. Introduction
The present invention relates to a system and method of managing resources in the context of a compute environment which may be defined as a grid or cluster of computers. Grid computing may be defined as coordinated resource sharing and problem solving in dynamic, multi-institutional collaborations. Many computing projects require much more computational power and resources than a single computer or computer processor can provide. Networked computers with peripheral resources such as printers, scanners, I/O devices, storage disks, scientific devices and instruments, etc. may need to be coordinated and utilized to complete a task.
Grid/cluster resource management generally describes the process of identifying requirements, matching resources to applications, allocating those resources, and scheduling and monitoring compute resources over time in order to run applications or compute jobs as efficiently as possible. Each project will utilize a different set of resources and thus is typically unique. In addition to the challenge of allocating resources for a particular job, administrators also have difficulty obtaining a clear understanding of the resources available, the current status of the environment and available resources, and the real-time competing needs of various users. General background information on clusters and grids may be found in several publications. See, e.g., Grid Resource Management: State of the Art and Future Trends, Jarek Nabrzyski, Jennifer M. Schopf, and Jan Weglarz, Kluwer Academic Publishers, 2004; and Beowulf Cluster Computing with Linux, edited by William Gropp, Ewing Lusk, and Thomas Sterling, Massachusetts Institute of Technology, 2003.
It is generally understood herein that the terms grid and cluster are interchangeable in that there is no specific definition of either. In general, a grid will comprise a plurality of clusters as will be shown in FIG. 1. Several general challenges exist when attempting to maximize resources in a grid. First, there are typically multiple layers of grid and cluster schedulers. A grid 100 generally comprises a group of clusters or a group of networked computers. The definition of a grid is very flexible and may mean a number of different configurations of computers. The introduction here is meant to be general given the variety of configurations that are possible. A grid scheduler 102 communicates with a plurality of cluster schedulers 104A, 104B and 104C. Each of these cluster schedulers communicates with a plurality of resource managers 106A, 106B and 106C. Each resource manager communicates with a series of compute resources shown as nodes 108A, 108B, 108C, 108D, 108E, 108F, 108G, 108H, 108I.
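For illustration only, the layered arrangement just described can be sketched as a small data structure. The class names, the pairing of one resource manager with each cluster scheduler, and the assignment of three nodes to each resource manager are assumptions made for this sketch; the text above specifies only the reference numerals and the order of the layers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str  # e.g. "108A"


@dataclass
class ResourceManager:
    label: str  # e.g. "106A"
    nodes: List[Node] = field(default_factory=list)


@dataclass
class ClusterScheduler:
    label: str  # e.g. "104A"
    resource_managers: List[ResourceManager] = field(default_factory=list)


@dataclass
class GridScheduler:
    label: str  # e.g. "102"
    cluster_schedulers: List[ClusterScheduler] = field(default_factory=list)

    def all_nodes(self) -> List[Node]:
        """Walk every layer and collect the compute nodes at the bottom."""
        return [
            node
            for cs in self.cluster_schedulers
            for rm in cs.resource_managers
            for node in rm.nodes
        ]


def build_example_grid() -> GridScheduler:
    """Build a grid using the reference numerals from the text.

    Assumption: one resource manager per cluster scheduler and three
    nodes per resource manager; the text does not fix this split.
    """
    node_labels = [f"108{c}" for c in "ABCDEFGHI"]
    grid = GridScheduler("102")
    for i, suffix in enumerate("ABC"):
        nodes = [Node(lbl) for lbl in node_labels[3 * i : 3 * i + 3]]
        rm = ResourceManager(f"106{suffix}", nodes)
        grid.cluster_schedulers.append(ClusterScheduler(f"104{suffix}", [rm]))
    return grid
```

In this sketch the grid scheduler reaches a node only by passing through a cluster scheduler and a resource manager, which mirrors the multiple scheduling layers noted above.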
Local schedulers (which may refer to the cluster schedulers 104A, 104B, 104C or the resource managers 106A, 106B, 106C) are closer to the specific resources 108 and may not allow grid schedulers 102 direct access to the resources. The resources are grouped into clusters 110, 112 and 114. Examples of cluster resources include data storage devices such as hard drives, compute resources such as computer processors, network resources such as routers and transmission means, and so forth. The grid level scheduler 102 typically does not own or control the actual resources. Therefore, compute jobs are submitted from the high level grid scheduler 102 to a local set of resources with no more permissions than the user would have. Compute jobs may also be submitted at the cluster scheduler layer of the grid or even directly at the resource managers. There are problems with the efficiency of this arrangement.
The heterogeneous nature of the shared resources causes a reduction in efficiency. Without dedicated access to a resource, the grid level scheduler 102 is challenged by the high degree of variance and unpredictability in the capacity of the resources available for use. Most resources are shared among users and projects, and each project differs from the others. The difference in performance goals for various projects also reduces efficiency. Grid resources are used to improve the performance of an application, but the resource owners and users have different performance goals, ranging from optimizing the performance of a single application to getting the best system throughput or minimizing response time. Local policies may also play a role in performance.
FIG. 2 illustrates the current state of the art, which allows a scheduler/resource manager combination to submit and control standard batch compute jobs. An example of a batch job is a request from a weather service to process a hurricane analysis. The amount of computing resources required is large and therefore the job is submitted to a cluster for processing. A batch job is submitted to the queue of a resource manager and is constrained to run within the cluster associated with that resource manager. A batch job 204, 206 or 208 within a queue 202 may have a number of steps, in which each step may have dependencies on other steps, on the successful or failed completion of previous steps, or on similar relationships. The bounds of influence for the batch jobs are limited to running non-root applications or executables on that cluster or on compute nodes that are allocated to it.
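As a minimal sketch of the step dependencies described above, the following shows a batch job whose steps run only when a prerequisite step has succeeded or failed. The names Step and run_batch_job, the string outcomes, and the example step names are hypothetical and chosen only for illustration; they do not represent any particular resource manager's interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    name: str
    action: Callable[[], bool]   # returns True on success, False on failure
    after: Optional[str] = None  # name of the prerequisite step, if any
    on: str = "success"          # run when the prerequisite ended in this state


def run_batch_job(steps: List[Step]) -> Dict[str, str]:
    """Run steps in the order given (assumed to be dependency order).

    Each step ends as "success", "failure", or "skipped". A step is
    skipped when its prerequisite did not end in the required state.
    """
    results: Dict[str, str] = {}
    for step in steps:
        if step.after is not None and results.get(step.after) != step.on:
            results[step.name] = "skipped"
            continue
        results[step.name] = "success" if step.action() else "failure"
    return results


# Hypothetical hurricane-analysis job: the analysis step fails, so the
# publish step is skipped and the failure handler runs instead.
example_steps = [
    Step("stage_data", lambda: True),
    Step("analyze", lambda: False, after="stage_data"),
    Step("publish", lambda: True, after="analyze"),
    Step("notify_failure", lambda: True, after="analyze", on="failure"),
]
example_results = run_batch_job(example_steps)
```

Every action in this sketch is an ordinary callable executed inside the job itself, which reflects the limited bounds of influence noted above: the job runs its own steps and nothing else.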
The respective batch job is unable to do anything outside of the constrained space for the job. There are a number of deficiencies with this approach, particularly in that such a job is unable to modify the scheduling environment. The job is only able to operate within the scheduling environment and is also constrained to performing only the specified actions. For example, the job may be constrained to run an executable within a compute node of the cluster (within its allocated space), but it is unable to perform any other action within the cluster or within the other services of the cluster.
What is needed is a method by which a queueable processing entity can be submitted to the scheduler, where that entity is more flexible than a standard batch job and has a broader scope of impact on the compute environment.