The present invention relates generally to cloud computing, and more particularly to autoscaling applications in shared cloud resources.
Users of cloud computing systems assume unlimited capacity; in reality, however, only a finite set of resources exists. Cloud computing systems attempt to dynamically allocate resources for user applications from this finite set to provide the illusion of unlimited capacity. Dividing a resource into finer-grained quantities, such as one virtual central processing unit (vCPU), and assigning these quantities dynamically is one typical method used in autoscaling.
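By way of illustration only, the fine-grained allocation described above may be sketched as follows. The sketch assumes a single resource counted in whole vCPU units; the names VcpuPool, scale, and free_units are hypothetical and are not drawn from any particular cloud platform.

```python
from dataclasses import dataclass, field


@dataclass
class VcpuPool:
    """A finite pool of vCPU units shared by applications (hypothetical model)."""

    total_units: int
    allocations: dict = field(default_factory=dict)

    @property
    def free_units(self) -> int:
        # Units not currently assigned to any application.
        return self.total_units - sum(self.allocations.values())

    def scale(self, app: str, demand_units: int) -> int:
        """Move an application's allocation toward its demand.

        The grant is capped by the units that are free plus the units the
        application already holds. Returns the number of units granted.
        """
        current = self.allocations.get(app, 0)
        available = self.free_units + current
        granted = max(0, min(demand_units, available))
        if granted:
            self.allocations[app] = granted
        else:
            # An application with no units is dropped from the table.
            self.allocations.pop(app, None)
        return granted


pool = VcpuPool(total_units=8)
pool.scale("app-a", 3)  # granted 3, leaving 5 free
pool.scale("app-b", 6)  # granted 5, because the pool is finite
pool.scale("app-a", 1)  # app-a scales down, releasing 2 units
pool.scale("app-b", 6)  # app-b can now be granted all 6
```

In this sketch, each application sees its allocation grow and shrink with demand, while the total assigned never exceeds the finite pool. The second call shows the point at which the illusion of unlimited capacity fails.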
Cloud platform as a service (PaaS) systems can host hundreds of applications per virtual machine. As these applications receive traffic, they consume resources, and each application may consume a different amount. When the capacity of a virtual machine resource, such as central processing unit (CPU) or memory, is saturated, some of the applications may need more resource capacity, while other applications do not.