Many companies and other organizations operate computer networks that interconnect numerous computing systems to support their operations, such as with the computing systems being co-located (e.g., as part of a local network) or instead located in multiple distinct geographical locations (e.g., connected via one or more private or public intermediate networks). For example, data centers housing significant numbers of interconnected computing systems have become commonplace, such as private data centers that are operated by and on behalf of a single organization, and public data centers that are operated by entities as businesses to provide computing resources to customers or clients. Some public data center operators provide network access, power, and secure installation facilities for hardware owned by various clients, while other public data center operators provide “full service” facilities that also include hardware resources made available for use by their clients. Examples of such large-scale systems include online merchants, internet service providers, online businesses such as photo processing services, corporate networks, cloud computing services (including high-performance computing services for executing large and/or complex computations), web-based hosting services, etc. These entities may maintain computing resources in the form of large numbers of computing devices (e.g., thousands of hosts) which are housed in geographically separate locations and which are configured to process large quantities (e.g., millions) of transactions daily or even hourly.
The advent of virtualization technologies for commodity hardware has provided benefits with respect to managing large-scale computing resources for many customers with diverse service needs, allowing various computing resources and services to be efficiently and securely shared by multiple customers. For example, virtualization technologies may allow a single physical computing machine to be shared among multiple users by providing each user with one or more virtual machines hosted by the single physical computing machine, with each such virtual machine being a software simulation acting as a distinct logical computing system that provides users with the illusion that they are the sole operators and administrators of a given hardware computing resource, while also providing application isolation and security among the various virtual machines. Furthermore, some virtualization technologies are capable of providing virtual resources that span two or more physical resources, such as a single virtual machine with multiple virtual processors that spans multiple distinct physical computing systems. As another example, virtualization technologies may allow data storage hardware to be shared among multiple users by providing each user with a virtualized data store which may be distributed across multiple data storage devices, with each such virtualized data store acting as a distinct logical data store that provides users with the illusion that they are the sole operators and administrators of the data storage resource.
One conventional approach for harnessing these resources to process data is the MapReduce model for distributed, parallel computing. In a MapReduce system, a large data set may be split into smaller chunks, and the smaller chunks may be distributed to multiple computing nodes in a cluster for the initial “map” stage of processing. Multiple nodes may also carry out a second “reduce” stage of processing based on the results of the map stage. In various cluster-based distributed computing systems, including some that implement MapReduce clusters, data to be accessed by compute nodes in a cluster may be stored within the virtualized resource instances of the cluster and/or in data storage systems that are separate from the virtualized resource instances of the cluster. In existing systems that implement MapReduce clusters, capacity may typically only be added or removed manually (e.g., as an individual stand-alone operation) by calling an API of the system, typically through the command-line interface. Therefore, MapReduce clusters are often under- or over-provisioned, resulting in delays (due to under-provisioning) or waste (due to over-provisioning).
While embodiments are described herein by way of example for several embodiments and illustrative drawings, those skilled in the art will recognize that embodiments are not limited to the embodiments or drawings described. It should be understood, that the drawings and detailed description thereto are not intended to limit embodiments to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope as defined by the appended claims. The headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description or the claims. As used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include”, “including”, and “includes” mean “including, but not limited to”.