1. Field of the Invention
The present invention relates to accessing compute resources within a compute environment and more specifically to providing a threshold-based access to compute resources.
2. Introduction
The present invention relates to a system and method of managing access to compute resources in the context of a grid or cluster of computers. Grid computing may be defined as coordinated resource sharing and problem solving in dynamic, multi-institutional collaborations. Many computing projects require much more computational power and resources than a single computer may provide. Networked computers with peripheral resources such as printers, scanners, I/O devices, storage disks, scientific devices and instruments, etc. may need to be coordinated and utilized to complete a task.
Grid/cluster resource management generally describes the process of identifying requirements, matching resources to applications, allocating those resources, and scheduling and monitoring compute resources over time in order to run workloads submitted to the compute environment as efficiently as possible. Each project will utilize a different set of resources and thus is typically unique. In addition to the challenge of allocating resources for a particular job, grid administrators also have difficulty obtaining a clear understanding of the resources available, the current status of the grid and available resources, and real-time competing needs of various users.
Several general challenges exist when attempting to maximize the use of resources in a compute environment. First, there are typically multiple layers of grid and cluster schedulers. FIG. 1 illustrates this point. A grid 100 generally comprises a group of clusters or a group of networked computers. The definition of a grid is very flexible and may mean a number of different configurations of computers. The introduction here is meant to be very general. The grid scheduler 102 communicates with a plurality of cluster schedulers 104A, 104B and 104C. Each of these cluster schedulers communicates with a plurality of resource managers 106A, 106B and 106C. Each resource manager communicates with a series of compute resources shown as nodes 108A, 108B and 108C. These may be referred to as a cluster or compute environment 110.
Second, local schedulers (which may refer to either the cluster schedulers 104A, 104B, 104C or the resource managers 106A, 106B, 106C) are closer to the specific resources 108 and may not allow grid schedulers 102 direct access to the resources. The grid level scheduler 102 typically does not own or control the actual resources. Therefore, jobs are submitted from the high-level grid scheduler 102 to a local set of resources with no more permissions than the user would have. This reduces efficiency.
Third, the heterogeneous nature of the shared resources causes a reduction in efficiency. Without dedicated access to a resource, the grid level scheduler 102 is challenged with the high degree of variance and unpredictability in the capacity of the resources available for use. Most resources are shared among users and projects and each project varies from the other.
Fourth, the performance goals for projects differ. Compute resources are used to improve performance of an application but the resource owners and users have different performance goals: from optimizing the performance for a single application to getting the best system throughput or minimizing response time. Local policies may also play a role in performance. Several publications provide introductory material regarding cluster and grid scheduling. See, e.g., Grid Resource Management, State of the Art and Future Trends, Jarek Nabrzyski, Jennifer M. Schopf, and Jan Weglarz, Kluwer Academic Publishers, 2004; and Beowulf Cluster Computing with Linux, edited by William Gropp, Ewing Lusk, and Thomas Sterling, Massachusetts Institute of Technology, 2003. The Beowulf Cluster Computing with Linux reference includes steps to creating a cluster.
Given the challenges associated with the compute environment, administrators have difficulty deciding which operating systems to install within a cluster. In many cases, clusters require more than one operating system, such as Macintosh, AIX, Microsoft NT, Linux, and so forth. In the majority of cases, an administrator or a group of administrators or managers will determine beforehand what particular mixture of these operating systems needs to be installed on the cluster nodes. In addition to operating systems, the same challenges exist for other resources within the cluster, such as software applications, memory requirements for each node, and other static or semi-static attributes.
These IT managers and administrators must make a best estimate of the distribution of their workload and then set up the cluster accordingly. For instance, within a 64 node cluster, an administrator may assign 48 nodes to one operating system, 12 nodes to another operating system, and 4 more nodes to a third operating system. The administrator must anticipate what the workload will be. The problem with this approach is that as the system comes on line, users begin to submit jobs according to their needs and not necessarily according to what was configured by the operators and managers.
Load balancing issues can immediately arise between the various partitions that exist by virtue of the different operating systems. Partitions may relate to partitioning one of: operating systems, memory, disk space, a software application, a license or some other compute resource. Partitions may also be soft partitions. One may find that the nodes running the first operating system are under-utilized while the nodes running the second and third operating systems are heavily over-utilized, and nothing can be done about it.
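The imbalance described above can be made concrete with a small calculation. The following is a minimal sketch only: the 48/12/4 split comes from the 64 node example, while the partition names and the demand figures are hypothetical values chosen for illustration.

```python
# Static partitions from the 64 node example: nodes assigned to each operating system.
# The names os_a, os_b and os_c are placeholders, not taken from the source.
partitions = {"os_a": 48, "os_b": 12, "os_c": 4}

# Assumed demand: how many nodes the submitted jobs actually request per
# operating system. These numbers are invented for illustration only.
demand = {"os_a": 12, "os_b": 30, "os_c": 10}


def partition_report(partitions, demand):
    """Return utilization, idle nodes and unmet demand for each static partition."""
    report = {}
    for name, nodes in partitions.items():
        wanted = demand.get(name, 0)
        busy = min(nodes, wanted)  # a partition cannot run more than it owns
        report[name] = {
            "utilization": busy / nodes,
            "idle_nodes": nodes - busy,
            "unmet_nodes": max(0, wanted - nodes),
        }
    return report


report = partition_report(partitions, demand)
for name, row in report.items():
    print(name, row)
```

Under these assumed figures, 36 nodes sit idle in the first partition while jobs needing 24 nodes wait on the other two, because a fixed partition cannot lend its idle nodes to another operating system.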
The cluster scheduler 104A, 104B, 104C simply applies a matching policy: when a job comes in that requires a particular operating system, the scheduler determines which node or set of nodes is best for running the job. If the scheduler 104A, 104B, 104C attempts but cannot establish a match between a job and the nodes, it queues the job until a match becomes available when some other job completes. In this regard, the schedulers are basically static systems that do not make fully intelligent decisions regarding work to be processed in the compute environment.
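The matching behavior described above might be sketched as follows. This is an illustrative simplification, not the logic of any particular scheduler: the function name, the job tuple layout, and the node names are all hypothetical, and "best node" is reduced to "first free node with the required operating system".

```python
def schedule(jobs, nodes):
    """Match each job to free nodes running the required operating system.

    jobs:  list of (job_id, required_os, node_count) tuples (hypothetical layout)
    nodes: dict mapping node_id -> installed operating system
    Returns (placements, queued), where queued holds jobs with no match.
    """
    free = dict(nodes)   # nodes not yet assigned to any job
    placements = {}
    queued = []
    for job_id, required_os, count in jobs:
        candidates = [n for n, os_name in free.items() if os_name == required_os]
        if len(candidates) >= count:
            chosen = candidates[:count]
            for n in chosen:
                del free[n]
            placements[job_id] = chosen
        else:
            # No match: the job waits until some other job completes.
            queued.append(job_id)
    return placements, queued


nodes = {"n1": "linux", "n2": "linux", "n3": "aix", "n4": "aix"}
jobs = [("j1", "linux", 2), ("j2", "linux", 1), ("j3", "aix", 1)]
placements, queued = schedule(jobs, nodes)
print(placements, queued)
```

In this sketch job j2 is queued even though node n4 is idle, because the scheduler only matches against the existing configuration and has no way to change what a node offers.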
Another challenge in allocating workload to a compute environment is the reservation process. A scheduler will “reserve” compute resources for a user. The reservation of some of the resources places restrictions on the use of those resources such that they will be available for jobs submitted by that user. In other words, a user may reserve, for example, 16 nodes out of the 64 nodes within a cluster for use at 10 am on Tuesday. Having this reservation guarantees that the requestor will have access to those resources at the appointed time. However, by making the reservation, the scheduler to a certain degree ties up those resources and restricts their availability and use for other users. If the requestor then underutilizes those resources, then they are not efficiently used.
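The cost of an underused reservation can be illustrated with a short sketch. The 16-of-64 node figures and the 10 am start come from the example above; the class and function names, the two-hour duration, and the usage figure are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    """Hypothetical reservation record; hours use a 24-hour clock."""
    owner: str
    nodes: int
    start_hour: int
    end_hour: int  # exclusive


def available_to(user, hour, total_nodes, reservations):
    """Nodes a user may use at a given hour: all nodes not reserved by others."""
    blocked = sum(
        r.nodes
        for r in reservations
        if r.owner != user and r.start_hour <= hour < r.end_hour
    )
    return total_nodes - blocked


def wasted_node_hours(reservation, nodes_actually_used):
    """Node-hours held by the reservation but never used by its owner."""
    duration = reservation.end_hour - reservation.start_hour
    return (reservation.nodes - nodes_actually_used) * duration


# 16 of 64 nodes reserved from 10 am; the two-hour length is an assumption.
res = Reservation(owner="alice", nodes=16, start_hour=10, end_hour=12)
print(available_to("bob", 10, 64, [res]))   # other users see fewer nodes
print(wasted_node_hours(res, nodes_actually_used=4))
```

With these assumed numbers, other users are limited to 48 nodes during the reservation, and if the requestor uses only 4 of the 16 reserved nodes, 24 node-hours go unused.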
What is needed in the art is a way to allow the administrative control over a compute environment to be more flexible in how access is granted to the compute environment such that efficiency can be improved.