Typically, in the service hosting arena, service providers and customers enter into an agreement that governs the usage of services offered by the cloud platform provider. These agreements are called Service Level Agreements (SLAs). SLAs, in turn, consist of several service level objectives (SLOs). SLOs are usually defined based on certain key criteria linked to the service provider, such as, for example, criteria relating to usage of storage or compute resources. With the increasing complexity of cloud platform architectures and their associated management processes, this list of key criteria has grown, and the associated difficulty of identifying and quantifying them has grown in turn.
Along with the growth in computing hardware capability, the number and complexity of web applications has risen, while, at the same time, application hosting has become an increasingly important service offered by cloud platform providers as enterprises realized that it was economical to outsource application hosting activity. Procuring expensive hardware upfront, without knowing the viability of the hosting business, is a significant risk that enterprises were, and are, not willing to take. In essence, by being allowed the flexibility to categorize hardware and application maintenance as non-core activities of their business, enterprises are able to concentrate resources on improving other aspects of their applications, such as user experience and functionality. Accordingly, the level of sophistication required to manage these data centers, or cloud platforms, where numerous applications could be hosted simultaneously increased manifold, along with the cost of maintaining them.
Additionally, it is desirable that service level agreements (SLAs) are specific to an application being hosted. In large scale cloud platforms, it is rarely the case that a single SLA is defined for a disparate set of applications that share the same resources. To this end, an overall integrated framework for deriving a governing SLA and its automated management is desirable. Such an SLA would take into account individual application characteristics while maximizing overall usage of cloud resources under a common set of governing policies.
Effective operationalization of such an SLA framework necessarily includes a mechanism for the management of the service level objectives defined therein, and may consequently include SLA monitoring and controlling mechanisms or policies whereby the SLA is monitored for compliance. Traditionally, load balancing techniques and admission control mechanisms have been used to secure a guaranteed quality of service (QoS) for a given SLA associated with a hosted application.
In one previous approach, load balancing techniques may be used to manage SLO requirements. The objective of such load balancing is to distribute incoming requests related to the hosted application onto a set of physical machines, each hosting a replica of an application, so that the load on the physical machines is equally distributed. Load balancing algorithms may execute on a physical machine that interfaces with a client or a client system. This physical machine, also called the front-end node, receives incoming requests and distributes these requests to different physical machines for further execution. The set of physical machines to which the requests are distributed are responsible for serving incoming requests, and are known as back-end nodes.
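By way of illustration only, the distribution of requests by a front-end node may be sketched as follows. The sketch assumes a simple round-robin policy, which is one of several possible ways to spread load evenly; the class name RoundRobinFrontEnd, the back-end node names, and the request dictionaries are hypothetical and are not taken from any particular system.

```python
from itertools import cycle


class RoundRobinFrontEnd:
    """Hypothetical front-end node that never inspects the request content."""

    def __init__(self, back_ends):
        if not back_ends:
            raise ValueError("at least one back-end node is required")
        # Endless iterator that visits the back-end nodes in a fixed order.
        self._next_back_end = cycle(back_ends)
        # Number of requests handed to each back-end node so far.
        self.assigned = {name: 0 for name in back_ends}

    def dispatch(self, request):
        """Return the back-end node chosen to serve the request."""
        node = next(self._next_back_end)
        self.assigned[node] += 1
        return node


# Usage: six requests spread over three replicas of the hosted application.
front_end = RoundRobinFrontEnd(["backend-a", "backend-b", "backend-c"])
targets = [front_end.dispatch({"path": "/item/%d" % i}) for i in range(6)]
```

In this sketch each back-end node receives the same number of requests, which equalizes load only to the extent that individual requests are of comparable cost.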
However, typically, the one or more algorithms executing on the front-end node are agnostic to the nature of the request. This means that the front-end node is neither aware of the type of client from which the request originates, nor aware of the category, such as browsing, sales, or payment, to which the request belongs. This category of load balancing algorithms is known as class-agnostic. There is a second category of load balancing algorithms that is known as class-aware. With class-aware load balancing and request distribution, the front-end node must additionally inspect the type of client making the request or the type of service requested before deciding which back-end node should service the request. However, inspecting a request to find out the class or category of a request is difficult because the client must first establish a connection with a node that is not responsible for servicing the request, i.e., a front-end node.
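For illustration, a class-aware front-end node may be sketched as below. The sketch assumes that each request already carries a category label and that each category is served by its own pool of back-end nodes; the class name ClassAwareFrontEnd, the pool layout, and the fallback to a default category are assumptions made for the example only.

```python
class ClassAwareFrontEnd:
    """Hypothetical front-end node that inspects the request category."""

    def __init__(self, pools, default_category):
        if default_category not in pools:
            raise ValueError("default category must have a back-end pool")
        if any(len(nodes) == 0 for nodes in pools.values()):
            raise ValueError("every pool needs at least one back-end node")
        self._pools = pools                      # category -> back-end nodes
        self._default = default_category
        self._cursor = {category: 0 for category in pools}

    def dispatch(self, request):
        """Pick a back-end node from the pool matching the request category."""
        category = request.get("category", self._default)
        if category not in self._pools:
            # Unknown categories fall back to the default pool (assumption).
            category = self._default
        pool = self._pools[category]
        index = self._cursor[category]
        # Rotate within the pool so that its nodes share the load.
        self._cursor[category] = (index + 1) % len(pool)
        return pool[index]


# Usage: payment requests are kept apart from browsing requests.
class_aware = ClassAwareFrontEnd(
    {"browsing": ["browse-1", "browse-2"], "payment": ["pay-1"]},
    default_category="browsing",
)
routed = [
    class_aware.dispatch({"category": "payment"}),
    class_aware.dispatch({"category": "browsing"}),
    class_aware.dispatch({"category": "browsing"}),
    class_aware.dispatch({"category": "sales"}),
    class_aware.dispatch({}),
]
```

The sketch leaves out the step described above as difficult, namely obtaining the category in the first place, which requires the front-end node to accept the connection and read the request before routing it.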
In another previous approach, admission control mechanisms may be used to manage SLO requirements. Admission control algorithms play an important role in deciding which set of requests should be admitted into the application server when the server experiences very heavy loads. During overload situations, since the response time for all the requests would invariably degrade if all the arriving requests are admitted into the server, it would be preferable to be selective in identifying a subset of requests that should be admitted into the system. The objective of admission control mechanisms, therefore, is to police incoming requests and identify a subset of incoming requests that can be admitted into the system when the system faces overload situations.
A disadvantage with this approach is that a client session may consist of multiple requests that are not necessarily unrelated. Consequently, when some requests are rejected, related requests belonging to the same session may be affected. Furthermore, the decision to reject a request can depend on the type of user making the request or the nature of the request being made. For example, a new request or a new session initiated by a high-priority user may be admitted while the requests from low-priority users are rejected. Similarly, requests that are likely to consume more system resources can be rejected during overload situations.
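An admission control policy of the kind described may be sketched, under stated assumptions, as follows. The class name AdmissionController, the abstract cost units, the overload threshold, and the rule that requests of an already admitted session continue to be admitted are all hypothetical choices made for the example and do not describe any particular prior system.

```python
class AdmissionController:
    """Hypothetical policy: police requests once the server is overloaded."""

    def __init__(self, capacity, overload_threshold=0.8,
                 max_cost_when_overloaded=1):
        self.capacity = capacity                  # total cost units available
        self.overload_threshold = overload_threshold
        self.max_cost_when_overloaded = max_cost_when_overloaded
        self.in_flight_cost = 0                   # cost units currently in use
        self.active_sessions = set()              # sessions admitted so far

    def is_overloaded(self):
        return self.in_flight_cost >= self.overload_threshold * self.capacity

    def admit(self, session_id, priority, cost):
        """Return True if the request is admitted, False if it is rejected."""
        if self.in_flight_cost + cost > self.capacity:
            return False                          # no room left at all
        if self.is_overloaded() and session_id not in self.active_sessions:
            # New sessions are admitted under overload only for
            # high-priority users issuing inexpensive requests.
            if priority != "high" or cost > self.max_cost_when_overloaded:
                return False
        self.in_flight_cost += cost
        self.active_sessions.add(session_id)
        return True

    def release(self, cost):
        """Free the cost units of a completed request."""
        self.in_flight_cost = max(0, self.in_flight_cost - cost)


controller = AdmissionController(capacity=10)
decisions = [
    controller.admit("s1", "low", 4),    # normal load
    controller.admit("s2", "low", 4),    # normal load, reaches the threshold
    controller.admit("s3", "low", 1),    # overload, new low-priority session
    controller.admit("s3", "high", 1),   # overload, new high-priority session
    controller.admit("s4", "high", 2),   # would exceed total capacity
    controller.admit("s1", "low", 1),    # overload, already admitted session
]
```

Admitting follow-up requests of an existing session, as assumed here, is one way to avoid breaking a session whose requests are related; it comes at the cost of admitting some low-priority work during overload.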
Accordingly, there is a need for an effective service level agreement operationalization scheme for an application hosted on a cloud platform, which may include means for monitoring and controlling the integrated service level agreement, whereby the utilization of cloud infrastructural resources with respect to demand is optimized.