The “cloud” is an abstraction that relates to resource management over a network and, more specifically, to an architecture that provides a platform for delivering services through a network. For example, the cloud may refer to various services delivered over the Internet, such as network-based storage services or compute services. Typical cloud architecture deployments are organized as a layered hierarchy comprising a physical layer of network hardware and one or more software layers that enable users to access the network hardware. For example, one common type of cloud architecture deployment includes a physical layer of network resources (e.g., servers, storage device arrays, network switches, etc.) accompanied by a multi-layered hierarchical software framework that includes a first layer that implements Infrastructure as a Service (IaaS), a second layer that implements Platform as a Service (PaaS), and a third layer that implements Software as a Service (SaaS). In general, although there may be exceptions, resources in the third layer are dependent on resources in the second layer, resources in the second layer are dependent on resources in the first layer, and resources in the first layer are dependent on resources in the physical layer.
In conventional cloud architectures, the resources in the physical layer may be allocated to services implemented in the first layer (i.e., IaaS services). For example, a resource manager for the first layer may be configured to allocate resources in the physical layer to different IaaS services running in the first layer. In turn, the resources in the first layer (i.e., IaaS services) may be allocated to services implemented in the second layer (i.e., PaaS services). For example, a resource manager for the second layer may be configured to allocate resources in the first layer to different PaaS services running in the second layer. The resources in the second layer (i.e., PaaS services) may be allocated to services implemented in the third layer (i.e., SaaS services). For example, a resource manager for the third layer may be configured to allocate resources from the second layer to different SaaS services running in the third layer.
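The layer-by-layer allocation described above can be illustrated with a minimal sketch. The `ResourceManager` class, the service names, and the abstract "units" of capacity are all hypothetical and are not drawn from any particular cloud platform; the sketch assumes only that each manager can hand out no more than the pool it received from the layer below.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Hypothetical manager that hands out units from the pool it was given."""
    layer: str
    capacity: int  # units this manager received from the layer below
    grants: dict = field(default_factory=dict)

    def available(self) -> int:
        # Units not yet granted to any dependent service.
        return self.capacity - sum(self.grants.values())

    def allocate(self, service: str, units: int) -> int:
        # Grant at most what remains in this manager's own pool.
        granted = min(units, self.available())
        self.grants[service] = self.grants.get(service, 0) + granted
        return granted


# The first-layer manager allocates physical-layer resources (assumed: 100 units).
first = ResourceManager("first layer (IaaS)", capacity=100)
iaas_units = first.allocate("iaas-compute", 60)

# The second-layer manager can only allocate what the IaaS service was given.
second = ResourceManager("second layer (PaaS)", capacity=iaas_units)
paas_units = second.allocate("paas-runtime", 40)

# The third-layer manager can only allocate what the PaaS service was given.
third = ResourceManager("third layer (SaaS)", capacity=paas_units)
saas_units = third.allocate("saas-app", 50)  # asks for 50, receives 40
```

In this sketch the SaaS application is capped at 40 units even though 40 units of the physical pool remain unallocated at the first layer, because the third-layer manager has no access to that pool.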
In such architectures, resources in the cloud are partitioned vertically on a first-come, first-served basis: each resource manager can allocate only the resources that have been allocated to it, and only to the dependent services that it manages. In addition, the resource pools of the cloud may be partitioned horizontally into different clusters, for example by dividing the total resources in the physical layer by data center or availability zone. As such, each service implemented in a particular cluster has access only to the resources allocated to that cluster, which may be a subset of the resources included in the cloud.
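As a rough illustration of horizontal partitioning combined with first-come, first-served allocation, consider the following sketch. The even split across zones, the zone names, and the helper functions `partition` and `serve_fcfs` are assumptions made for the example; an actual deployment may size clusters in any manner.

```python
def partition(total_units, zones):
    """Split a pool across zones (assumed policy: as evenly as possible)."""
    base, extra = divmod(total_units, len(zones))
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


def serve_fcfs(cluster_capacity, requests):
    """Serve (service, units) requests in arrival order from one cluster's pool."""
    remaining = cluster_capacity
    grants = {}
    for service, units in requests:
        granted = min(units, remaining)  # later arrivals get whatever is left
        grants[service] = granted
        remaining -= granted
    return grants


# Assumed: 100 physical units divided between two availability zones.
clusters = partition(100, ["zone-a", "zone-b"])

# Services in zone-a draw only on zone-a's share, in arrival order.
grants_a = serve_fcfs(clusters["zone-a"], [("svc-1", 30), ("svc-2", 30)])
# grants_a == {"svc-1": 30, "svc-2": 20}
```

Here the second service receives only 20 of the 30 units it requested, even though the pool in `zone-b` is untouched.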
The resulting allocation of resources in such architectures is typically inefficient. For example, a particular application (i.e., a SaaS service) in one cluster may have a high resource utilization rate because many users are using it, and the application is slowed down because it can run only on the resources allocated to that cluster. Meanwhile, another application in another cluster may have a low resource utilization rate because only a few users are using that other application. The resource manager for the first layer, which allocates resources in the physical layer to the two clusters, may not have visibility into the resource utilization rates of the applications running on each cluster and, therefore, the resources of the physical layer may be utilized inefficiently.
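The inefficiency can be made concrete with a small worked example. The capacities and demand figures below are invented for illustration, and `cluster_report` is a hypothetical helper rather than part of any real monitoring API.

```python
def cluster_report(capacity, demand):
    """Summarize one cluster: fraction of capacity in use, unmet demand, idle units."""
    used = min(capacity, demand)
    return {
        "utilization": used / capacity,
        "unmet": demand - used,
        "idle": capacity - used,
    }


# Assumed figures: two clusters of 50 units each with very different demand.
busy = cluster_report(capacity=50, demand=80)   # popular application
quiet = cluster_report(capacity=50, demand=10)  # lightly used application

# Units that sit idle in the quiet cluster while the busy cluster is starved.
stranded = min(busy["unmet"], quiet["idle"])  # 30
```

Under these assumed numbers the busy cluster runs at full utilization with 30 units of unmet demand, while the quiet cluster leaves 40 units idle. A manager with visibility into both utilization rates could in principle satisfy all 30 units of unmet demand from the idle capacity.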