Organizations of all sizes use the World Wide Web (“Web”) for commerce and to improve productivity, market share and internal/external processes. Web sites have become a mission-critical necessity in today's business environment. Under such mission-critical conditions, unpredictable service levels result in loss of revenue and market leadership. To avoid this costly impact, web sites must be highly available and dependable.
The Web comprises host computers that are contacted by one or more client computers via browsers. A high-level protocol (i.e., at the application layer in the layered OSI network model), the Hypertext Transfer Protocol (HTTP), is used to open a connection to an indicated server, send a request, receive a reply and then display the contents of the reply to the user. In response to requests from client computers, host computers transmit data in the form of web pages. These requests are typically serviced on a “first come, first served” basis. Host computer resources are committed to servicing clients' requests and, as a result, there is a finite limit on the number of client requests that host computers can handle simultaneously. That is, when host computers receive many requests from many clients, they may be unable to service the requests in a timely fashion because host computer resources are depleted. Further disadvantageously, under heavy loads host computers will stop accepting new requests altogether until host computer resources are freed, leaving clients of heavily used sites in some cases unable to gain access.
It is known that capacity-planning techniques, such as over-provisioning computer resources to meet and exceed projected peak demand, can alleviate this problem. However, this only increases the capacity of host computers to accept more connections and postpones the above-mentioned behavior. Because HTTP requests are handled “first come, first served”, the web site will admit the next request in the queue and commit host computer resources to it. When the site is a corporate site, it is entirely conceivable that under peak load this policy of treating every request uniformly will apply a corporate resource, such as the web site, inappropriately and indiscriminately, resulting in non-optimal use of that resource in the revenue-generating process. On sites where e-commerce is conducted this can translate into a real loss of revenue.
In known networks, class of service (CoS) is honored in the network infrastructure, i.e., at the physical/hardware level, and is implemented as a means of determining the network bandwidth allocated to flows, commensurate with pre-negotiated policies. Unfortunately, this known CoS policy terminates at the network layer, below the application layer, in the layered OSI network model. This means that end-to-end CoS is not presently available in known network configurations, because once a flow reaches the application layer (Web layer or HTTP layer) there are no metrics with which to implement and honor CoS. Consequently, the notion of an end-to-end CoS policy in the context of a policy-enabled network, such as an internet or intranet, breaks down, and any negotiated differentiated service to the client is not universally available. Network-based CoS does not deal with back-end server resource allocation, and it does not facilitate differentiation of service on the back-end server. Thus, even with network-level CoS/QoS, all users will uniformly experience degradation of service as a result of back-end server utilization conditions (e.g., bottlenecks, overloads).
Site resource and performance management implementations are known for the intended purpose of improving host availability and dependability in the context of internets and intranets. Known “load balancing” implementations reallocate requests for overloaded servers to available servers. While such implementations monitor server loads and help to avoid disconnects due to overloading, they are typically based on a fixed or static algorithm that effects reallocation. Known load balancing techniques do not provide enhanced quality of service (QoS) as a function of client characteristics; rather, QoS is enhanced only by reallocation of load. Load balancing implementations known in the art treat all requests equally, and merely redirect traffic with little regard for the client generating the request, what the client is requesting, or the nature of the transaction. In an ISP environment, existing QoS solutions do not permit provision of differentiated services to each virtual site. High-end clients are treated the same as low-end clients. Known load balancing approaches have no mechanism to distinguish or prioritize users, transactions or requests. Thus, even with load balancing in place, there is still a first come, first served policy. All requests are treated equally and there is no provisioning of resources based on a user or request.
One known implementation that purportedly provides enhanced QoS as a function of client characteristics is HP WebQoS, available from Hewlett-Packard Company, Palo Alto, Calif. HP WebQoS enhances web performance, capacity and availability only in the HP-UX operating environment. It permits site managers to prioritize web site service levels, allowing higher service quality to be allocated as a function of certain, limited, client characteristics. HP WebQoS monitors service levels and adjusts workload scheduling and processing rates based on policies configured by the administrator. The HP WebQoS technology prioritizes access as a function of client characteristics by scheduling HTTP requests into three different priority queues. The three queues are static and not further extensible, and control the resources allocated to servers and applications. The implementation disadvantageously depends on a particular operating environment, and relies on a proprietary controller (i.e., Cisco LocalDirector) to effect functionality.
The HP WebQoS architecture is illustrated in FIG. 1, and comprises essentially four components: a request controller 10, a service resource controller 12, a LocalDirector (LD) controller 14 to manage the proprietary Cisco LocalDirector, and a management system 16. The request controller 10 classifies requests into high, medium, or low priority based on a configured policy. The three priority levels are used to determine admission priority and performance level. Classification into the three priority levels can be done as a function of: source IP address; destination IP address; URL; port number; hostname; and IP type-of-service. Initial requests are classified, and thereafter the session is marked so that all subsequent requests associated with that session continue to be classified at the same priority.
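The classification behavior described above can be illustrated with a minimal sketch. It is not HP WebQoS code: the `Request` fields simply mirror the six classification attributes named in the text, while the rule format, the `RequestClassifier` name, the session identifier, and the default of "low" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Request:
    session_id: str   # hypothetical session key; the text only says a session is "marked"
    source_ip: str
    dest_ip: str
    url: str
    port: int
    hostname: str
    ip_tos: int       # IP type-of-service value


# A rule pairs a predicate over the request with one of the three priority levels.
Rule = Tuple[Callable[[Request], bool], str]


class RequestClassifier:
    """Classify the first request of a session; reuse that priority afterwards."""

    def __init__(self, rules: List[Rule], default: str = "low") -> None:
        self.rules = rules
        self.default = default  # assumed fallback when no rule matches
        self._session_priority: Dict[str, str] = {}

    def classify(self, req: Request) -> str:
        # Later requests in a marked session keep the session's priority.
        if req.session_id in self._session_priority:
            return self._session_priority[req.session_id]
        priority = self.default
        for predicate, level in self.rules:  # first matching rule wins (assumption)
            if predicate(req):
                priority = level
                break
        self._session_priority[req.session_id] = priority
        return priority
```

With a rule such as "source address in 10.1.x.x is high priority", a session that starts from such an address stays at high priority even if a later request in the same session would not match the rule on its own, which is the session-marking behavior the text describes.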
The request controller 10 controls admission decisions for new sessions, and admits, redirects, defers, or rejects new sessions based on the configured policy. Configurable admission policies are based on the user or service class and on various performance or resource conditions, such as CPU utilization or the length of the queues handling the high, medium or low priority HTTP requests. Admitted requests are queued into high, medium, and low priority queues. The queues are serviced based on the configured policy, resulting in variation in performance at each level.
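A minimal sketch of such an admission scheme follows. The thresholds, the `AdmissionController` name, and the mapping of conditions to outcomes are assumptions; the text states only that decisions depend on service class and on conditions such as CPU utilization and queue length. Redirection is omitted for brevity, and strict-priority servicing is assumed as one possible servicing policy.

```python
from collections import deque
from typing import Deque, Dict, Optional

LEVELS = ("high", "medium", "low")  # the three priority levels, highest first


class AdmissionController:
    """Per-class admission against CPU and queue-length limits (illustrative)."""

    def __init__(self, max_queue_len: Dict[str, int], cpu_limit: Dict[str, float]) -> None:
        self.max_queue_len = max_queue_len  # assumed per-class queue capacity
        self.cpu_limit = cpu_limit          # assumed per-class CPU utilization ceiling (0..1)
        self.queues: Dict[str, Deque[str]] = {lvl: deque() for lvl in LEVELS}

    def admit(self, request_id: str, level: str, cpu_utilization: float) -> str:
        """Return 'admit', 'defer', or 'reject' for a new request."""
        if cpu_utilization >= self.cpu_limit[level]:
            return "reject"   # host too busy for this class
        if len(self.queues[level]) >= self.max_queue_len[level]:
            return "defer"    # class queue is full; caller may retry later
        self.queues[level].append(request_id)
        return "admit"

    def next_request(self) -> Optional[str]:
        """Serve the highest-priority non-empty queue first (assumed policy)."""
        for lvl in LEVELS:
            if self.queues[lvl]:
                return self.queues[lvl].popleft()
        return None
```

Giving the low class a lower CPU ceiling than the high class means that, as load rises, low-priority sessions are turned away first while high-priority sessions are still admitted, which is the variation in service the text attributes to the three levels.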
The service resource controller 12 manages hardware resource allocation. Greater resources are allocated per unit of workload for higher priority services. The service resource controller 12 controls allocation as a function of percent CPU utilization and percent disk I/O utilization.
The LD Controller 14 runs on web servers and manages the proprietary Cisco LocalDirector. The LD Controller 14 dynamically manages server weights by setting Cisco LocalDirector's relative weights for each server in a cluster using SNMP_Set. Initial weights for each server are generated using tested throughput results. The LD Controller then dynamically adjusts the LocalDirector weightings to match each server's actual capacity during operation. The LD Controller is loaded on each server using the management system.
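The weight adjustment described above can be sketched as a proportional calculation. The text does not give the formula actually used, so scaling each server's weight to its share of total measured capacity is an assumption, as are the function names and the `set_weight` callback, which stands in for whatever SNMP set operation a real deployment would issue.

```python
from typing import Callable, Dict


def relative_weights(capacity: Dict[str, float], scale: int = 100) -> Dict[str, int]:
    """Map each server's measured capacity to an integer weight proportional to its share."""
    total = sum(capacity.values())
    if total <= 0:
        raise ValueError("total capacity must be positive")
    return {server: round(scale * c / total) for server, c in capacity.items()}


def push_weights(weights: Dict[str, int], set_weight: Callable[[str, int], None]) -> None:
    """Hand each weight to a caller-supplied setter (hypothetical stand-in for an SNMP set)."""
    for server, weight in weights.items():
        set_weight(server, weight)
```

Initial weights would be computed from tested throughput figures; during operation the same calculation can be rerun with observed capacity so that a server that slows down receives a smaller share of new connections.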
The management system 16 in HP WebQoS provides a GUI for creating, editing, and deleting Service Level Objectives (SLOs). The SLOs are the capacity and performance objectives configured per user or service class. Admission of requests, resource allocation, and priority queue servicing are configured using the management system to issue directives to the request controller 10 and the service resource controller 12.
Disadvantageously, the HP WebQoS architecture is highly platform dependent: it requires a particular operating environment, HP-UX, and it relies on a proprietary controller, Cisco LocalDirector, to effect functionality, which limits its applicability to systems that include those proprietary components. The implementation is focused on applying policy for differentiated services to categorize users into queues. Flash crowds, i.e., periods of unusually high traffic requesting services, may end up being categorized into one or another of the queues, leading to a queue imbalance. As a result, traffic may be rejected in one queue even though server resources are available to service those requests through another queue. Further, in a web farm configuration, HP WebQoS does not permit implementation of class management at the back-end server. Only one load balancing mechanism is used for all back-end servers. The concept of categorizing back-end servers into various priority classes does not exist. Prioritization is limited to the (at most) three front-end queues. Under flash-crowd conditions, low-priority users will be denied access to the site(s). This may have long-term business consequences, e.g., loss of revenue, and it will result in a negative quality of experience for users. The notion of provisioning and reserving resources does not exist in the HP WebQoS implementation.
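The queue imbalance can be made concrete with a small simulation. This is an illustrative model only, assuming three fixed-capacity queues with no sharing between them; the capacities and the function name are hypothetical.

```python
from typing import Dict, List, Tuple


def simulate_static_queues(
    arrivals: List[str], queue_cap: Dict[str, int]
) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Fill fixed-size per-class queues; return (queue depth, rejections) per class."""
    depth = {cls: 0 for cls in queue_cap}
    rejected = {cls: 0 for cls in queue_cap}
    for cls in arrivals:
        if depth[cls] < queue_cap[cls]:
            depth[cls] += 1       # request fits in its own class queue
        else:
            rejected[cls] += 1    # rejected even if other queues are empty
    return depth, rejected
```

If 25 requests all classified as low priority arrive while each queue holds at most 10, then 15 are rejected although the high and medium queues together have 20 free slots, which is the imbalance described above.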
Further, disadvantageously, HP WebQoS does not allow prioritization of traffic or site resources based on a virtual site in an ISP environment. Still further, dependence on the three front-end priority queues significantly limits the allocation of services among classes, because the number of classes of requests is bounded by the number of available queues. Accordingly, HP WebQoS imposes a significant limitation on the differentiation of service(s) allocated to admitted requests. The three priority levels are used to determine only admission priority and performance level, which again imposes a significant limitation on the differentiation of service(s) allocated to admitted requests. Only a limited number of client characteristics can be used to classify requests with HP WebQoS, severely limiting the classification of traffic. In conjunction with the limited number of classification queues, the limited classification characteristics in HP WebQoS make administration of complex policies and rules for classification of requests virtually impossible. The limited classification characteristics in HP WebQoS also significantly limit client service differentiation and do not facilitate differentiation based on adaptive modeling of client behavior.