1. Field of the Invention
The present invention generally relates to distributed computing systems, and more particularly to systems and methods that utilize dynamic cost models to schedule usage of computing resources within such systems.
2. Relevant Background
Distributed or grid computing generally refers to the use of a collection of distributed, heterogeneous computing resources that may be spread across shared networks and/or geographic areas to satisfy what may be very large computing tasks or demands. For instance, a “compute” or “server” farm may include a plurality of complete computers or servers (e.g., each with onboard CPUs, memory, storage, power supplies, network interfaces, and the like) that are connected to one or more networks (e.g., LAN, WAN, Internet) by any appropriate conventional network interface(s). In a distributed or grid computing environment, the various disparate computers and systems in an organization or among organizations can be organized and managed as a grid to become one large, integrated computing system. The single integrated system can then handle problems and processes that are too large and intensive for any single computer of the organization(s) to handle efficiently. The resources of one or more of such farms may be leveraged by jobs or “workload items” of one or more organizations or entities over one or more networks. Such jobs and workload items may take many forms, such as particular applications that need to be executed, tasks that need to be performed (e.g., providing help desk support for a period of one year), and the like. If managed properly, grid computing can result in reduced cost of ownership, aggregated and improved efficiency of computing, data, and storage resources, and enablement of virtual organizations for application and data sharing.
Computing over such farms has garnered interest from various industries over time and has evolved as the focus has shifted from simple compute farms to larger scale grids and, most recently, to very large scale setups often referred to as compute clouds. Large numbers of workloads may be submitted into such a compute farm, or cloud, with associated service level agreements (SLAs) and other policies and constraints. Products are available from various vendors as well as from open source projects to address such SLAs, policies, constraints, and/or the like. A recent incarnation in the form of compute clouds (i.e., cloud computing) focuses mainly on hosting services that deliver compute capacity to interested users in a more elastic fashion, whereby the amount of resources provisioned for a given user or group scales up and down based on demand. In this regard, the user pays for resources actually consumed in a manner similar to paying for utilities such as domestic electric power, natural gas, and the like.
A core part of a system that provides compute farm/grid/cloud services is the distributed resource scheduler. The scheduler typically evaluates all available resources (e.g., processing capacity, available memory, and the like) against the requested resource usages of incoming workload items (as well as existing SLAs, policies, constraints, and the like) as part of building a schedule of workload execution (i.e., which workload items have priority for resources of the grid or farm relative to other workload items). When demand exceeds available resources, some workload items may have to wait for later execution. Other criteria may also cause some workload items to wait, such as SLAs that specify a calendar time or other constraints that can only be met at a later time.
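The scheduling behavior described above can be illustrated with a minimal sketch. It is not drawn from any particular scheduler product; the `WorkloadItem` fields, the `build_schedule` function, the priority convention, and the abstract time units are all assumptions chosen for illustration. The sketch admits workload items in priority order while resources remain, and defers items that either do not fit or carry a start-time constraint that cannot yet be met.

```python
from dataclasses import dataclass


@dataclass
class WorkloadItem:
    # All fields are hypothetical; real schedulers track many more attributes.
    name: str
    cpus: int                 # requested CPU cores
    memory_gb: int            # requested memory in GB
    priority: int             # larger value means more urgent (assumed convention)
    earliest_start: int = 0   # SLA-style calendar constraint, abstract time units


def build_schedule(items, free_cpus, free_memory_gb, now=0):
    """Greedy sketch: return (run_now, deferred) lists of workload names."""
    run_now, deferred = [], []
    # Consider the most urgent workload items first.
    for item in sorted(items, key=lambda w: -w.priority):
        fits = item.cpus <= free_cpus and item.memory_gb <= free_memory_gb
        if fits and item.earliest_start <= now:
            # Reserve the requested resources for this item.
            free_cpus -= item.cpus
            free_memory_gb -= item.memory_gb
            run_now.append(item.name)
        else:
            # Demand exceeds supply, or the time constraint is not yet met.
            deferred.append(item.name)
    return run_now, deferred


items = [
    WorkloadItem("render", cpus=8, memory_gb=32, priority=5),
    WorkloadItem("backup", cpus=2, memory_gb=4, priority=1),
    WorkloadItem("report", cpus=4, memory_gb=8, priority=3, earliest_start=10),
    WorkloadItem("simulate", cpus=8, memory_gb=16, priority=4),
]
run_now, deferred = build_schedule(items, free_cpus=12, free_memory_gb=64)
print(run_now)   # ['render', 'backup']
print(deferred)  # ['simulate', 'report']
```

In this example, "simulate" waits because too few CPUs remain after "render" is admitted, while "report" would fit but waits because of its start-time constraint.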
Some farms choose to allocate costs for consuming farm resources to users according to a specific monetary amount per unit time in relation to a particular type of resource (e.g., a user may be charged $0.10 per hour of CPU, network, storage, or other services or resources consumed). Other farms may choose not to charge actual currency but instead to incorporate other ways of sharing resources, such as a “fair share” mechanism whereby various organizations or departments using the shared farm are each given a percentage of the total computing “pie”.
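The two allocation approaches above can be sketched as simple arithmetic. The function names, the organization names, the share percentages, and the 1,000 CPU-hour capacity below are hypothetical values chosen for illustration; only the $0.10 per hour rate comes from the example above.

```python
def usage_charge(hours_used, rate_per_hour):
    """Monetary charge for a resource billed at a flat rate per unit time."""
    return round(hours_used * rate_per_hour, 2)


def fair_share_allocation(total_units, shares):
    """Split total capacity among organizations in proportion to their shares."""
    total_share = sum(shares.values())
    return {org: total_units * share / total_share
            for org, share in shares.items()}


# Flat-rate model: 250 CPU-hours at $0.10 per hour.
charge = usage_charge(250, 0.10)
print(charge)  # 25.0

# Fair-share model: 1,000 CPU-hours divided by assumed percentage shares.
allocation = fair_share_allocation(
    1000, {"engineering": 50, "research": 30, "finance": 20}
)
print(allocation)  # {'engineering': 500.0, 'research': 300.0, 'finance': 200.0}
```

Normalizing by the sum of the shares keeps the split proportional even if the configured percentages do not add up to exactly 100.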