Technical Field
The present disclosure relates to virtualized compute infrastructure in public, private, and hybrid public-private environments, and to a method and system for the automatic, real-time optimization of that virtualized compute infrastructure based on characteristics of the jobs intended to run within the environment.
Description of Related Art
Cloud-based resources are typically virtualized compute environments in which a series of physical servers are provisioned into one or more virtual servers available for temporary use, each utilizing a subset of the total resources of a physical machine. When using one or more infrastructure-as-a-service (IaaS) providers, the user typically has control over the desired configuration of the virtualized infrastructure. Examples of this control may include the ability to choose virtual hardware from a list of predetermined configurations or the ability to specify the exact resources required for the desired virtualized hardware configuration. Cloud-based computational infrastructure typically has different price points based on the resources provided by the different virtualized machine types: the more resources provided, the higher the cost per unit of time.
Traditional compute environments are typically of fixed size and limited heterogeneity because they are based on the limited hardware configurations available at the time of purchase. As a result, the machines may or may not be capable of performing the work a job requires, because of platform incompatibilities, insufficient disk space or memory, or other limiting factors such as network interconnect. Conversely, it is also possible that a job running on a single machine underutilizes the machine, resulting in wasted resources.
Current approaches to auto-scaling compute resources for an application, such as that shown in U.S. Pat. No. 8,346,921, focus on predictive auto-scaling, based upon historical usage, of a single virtual machine configuration (in terms of RAM, number of CPUs, etc.) or instance type. These approaches ignore situations where many applications are submitted to a compute environment, each requiring its own optimal virtual machine (VM) configuration or multiple VM configurations.
Similar current cloud-based solutions, such as CycleCloud, provision virtualized hardware based on the number of idle jobs in a queuing, messaging, or batch scheduling system such as Condor or GridEngine. These solutions require that the virtualized configuration of the execution nodes used to run the jobs be determined ahead of time; they then automatically provision virtual machines matching this one configuration to execute idle jobs in the queue, message bus, or scheduler. This often results in the need to define a virtual machine configuration capable of running any job in a workflow, in effect the least common multiple machine configuration for all the jobs, even if this configuration has resources in excess of those needed by the majority of the jobs.
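The cost of a single predetermined configuration can be illustrated with a minimal sketch. The job names, resource figures, and helper functions below are hypothetical and chosen only for illustration; they are not taken from CycleCloud, Condor, GridEngine, or any other scheduler. The sketch assumes each job declares only a CPU count and a RAM size, and that the one-size-fits-all configuration is the per-resource maximum over all jobs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRequirements:
    """Hypothetical per-job resource needs (illustrative fields only)."""
    name: str
    cpus: int
    ram_gb: int


def envelope_config(jobs):
    """Return the single (cpus, ram_gb) configuration able to run every job.

    This is the per-resource maximum across all jobs, i.e. the
    one-size-fits-all configuration that must be chosen ahead of time.
    """
    return (max(j.cpus for j in jobs), max(j.ram_gb for j in jobs))


def wasted_resources(jobs):
    """Return, per job, the (cpus, ram_gb) left idle on the envelope machine."""
    cpus, ram_gb = envelope_config(jobs)
    return {j.name: (cpus - j.cpus, ram_gb - j.ram_gb) for j in jobs}


# Illustrative workload: one small job, one CPU-heavy job, one memory-heavy job.
jobs = [
    JobRequirements("small", cpus=2, ram_gb=4),
    JobRequirements("cpu_heavy", cpus=16, ram_gb=8),
    JobRequirements("mem_heavy", cpus=1, ram_gb=64),
]

print(envelope_config(jobs))   # (16, 64)
print(wasted_resources(jobs))
```

In this made-up workload no single job needs both 16 CPUs and 64 GB of RAM, yet every provisioned machine must offer both, so each job leaves part of its machine idle.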
It is desirable to leverage the modern virtualized compute environment's ability to specify different hardware configurations in real time, such that automatic provisioning of infrastructure can be based on the specific needs and characteristics of each job in a queue of jobs. Optimizing the type of virtual instance created by using metadata from queued jobs would allow jobs to complete faster, at lower cost, and with fewer errors. The present disclosure describes such a method and system.