Modern software applications are often developed as one or more independent but interrelated components (or “services”) according to so-called service-oriented architecture (SOA) design principles. Each constituent service of a service-oriented software application typically implements a self-contained, discrete unit of functionality and interoperates with other services via defined application programming interfaces (APIs) to carry out the broader functionality of the application formed by the collection of services. Implementing software applications using SOA design principles can often improve the modularity and resiliency of the applications and can better enable development teams to create, deploy, and scale their respective services independently, among other benefits.
Because the services forming a service-oriented software application are modular and each implements a different type of functionality, each service can have different computing resource needs. For example, some services can be more central processing unit (CPU) and input/output (I/O) intensive, while other services can be more graphics processing unit (GPU) intensive, and so forth. Furthermore, the variable nature of the workloads processed by some applications can cause different types of computing resources and different services to become constrained at various points in time. Application developers and system administrators of such software applications often have difficulty determining an appropriate amount of computing resources to devote to each of the various application components and appropriately scaling the provisioned resources as workloads change over time.