Application service providers and Web hosting services that co-host multiple customer sites on the same server cluster or large SMPs are becoming increasingly common in the current Internet infrastructure. With the growth of e-commerce on the Web, any server downtime that affects the clients being serviced results in a corresponding loss of revenue. Additionally, unpredictable flash crowds can overwhelm a hosting server and bring down multiple customer sites simultaneously, degrading performance for a large number of clients. It is therefore essential for hosting services to provide performance isolation, fast recovery times, and continuous operation under overload conditions, at least to preferred customers.
Each of the co-hosted customer sites may have different quality-of-service (QoS) goals based on the price of the service and the application requirements. Furthermore, each customer site may require different services during overload based on the client's identity (e.g., a preferred gold client) and the content the client accesses (e.g., a buy order versus a browsing request). When providing service differentiation during overload, it is important to know who sent the request and what the request is intended to do. The current technique of differentiating requests by the network header values of the incoming connection is not sufficient. The network headers (IP addresses and port numbers) identify only the originating client machine, the destination machine, and the receiving application at the destination port. Clients behind a proxy all share the same network address and cannot be distinguished from one another. Similarly, the port number can reveal that a request is an FTP transfer rather than an HTTP transfer, but it cannot distinguish a browse request from a buy order. Current commercial switches and routers use a simple threshold-based request discard policy (e.g., a TCP SYN drop mode) that discards the incoming connection, the oldest connection, or a random connection to control overload. Such techniques may delay or control overload, but they pay a penalty when they discard a high-priority gold customer request instead of an ordinary request. These content-unaware approaches are not adequate because they do not distinguish between individual QoS requirements. Nor do they account for the uneven cost of requests. For example, a majority of the load is caused by a few CGI requests, and most of the bytes transferred belong to a small set of large files. Earlier studies have shown that 90% of web requests are for 10% of the pages at a site, yet 10% of the requests account for 80% of the data transferred.
Consider, for example, a news site with a small main page that is accessed by the majority of customers. A few customers will download a large audio/video news segment, which places a high load on the server and the network. Dropping all of the small page requests might not reduce the server load as much as dropping the single video segment request. This suggests that targeting specific information and client identities (e.g., URIs, types of URIs, cookie information, SSL session IDs) for service differentiation can have a wide impact during overload.
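The contrast between header-based and content-based classification can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Request` record, the three service classes, the `tier` cookie name, the `/buy` path prefix, and the media file suffixes are all hypothetical and are not taken from any particular system.

```python
from dataclasses import dataclass
from http.cookies import SimpleCookie
from urllib.parse import urlsplit

# Hypothetical service classes (lower number = higher priority).
GOLD, NORMAL, BULK = 0, 1, 2


@dataclass(frozen=True)
class Request:
    """Illustrative request record; all field names are assumptions."""
    src_ip: str
    dst_port: int
    uri: str
    cookie: str = ""


def classify_by_header(req: Request) -> tuple:
    """Content-unaware view: only the network header fields are visible."""
    return (req.src_ip, req.dst_port)


def classify_by_content(req: Request) -> int:
    """Content-aware view: inspect the URI and cookie of the request."""
    path = urlsplit(req.uri).path
    cookies = SimpleCookie()
    cookies.load(req.cookie)
    tier = cookies["tier"].value if "tier" in cookies else ""
    # Assumed policy: gold clients and buy orders get the highest class.
    if tier == "gold" or path.startswith("/buy"):
        return GOLD
    # Assumed policy: large media downloads go to the bulk class.
    if path.endswith((".mpg", ".mp3")):
        return BULK
    return NORMAL


# Two clients behind the same proxy share one source address.
browse = Request("10.0.0.1", 80, "/index.html", "tier=basic")
buy = Request("10.0.0.1", 80, "/buy?item=3", "tier=basic")
video = Request("10.0.0.1", 80, "/news/clip.mpg")
```

In this sketch `browse`, `buy`, and `video` are indistinguishable to `classify_by_header`, while `classify_by_content` places them in three different classes.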
One approach to content-aware service differentiation is to perform it within the application or in user space. This approach has two drawbacks. First, content-based control then requires that the application be modified and made aware of service differentiation functions during overload. This does not achieve application transparency, and modifying legacy applications is difficult. Second, control is handed to the application at a much later stage than the point at which operating system kernel processing begins. As a result, low-priority requests, or requests that will subsequently be discarded, consume precious server resources during overload while doing no useful work. Service differentiation during overload should instead follow an “early discard” policy: the decision to prioritize, delay, or discard a request should be made as soon as the request is received by the kernel. This implies that the ideal location for content-aware service differentiation is within the operating system kernel.
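The early discard idea can be illustrated with a small user-space simulation in Python. It is a sketch only, not kernel code: the service classes, the per-class load thresholds, and the function names are assumptions chosen for illustration. The point it shows is that the admit-or-discard decision is made at arrival, using the request's class and the current load, before any further processing is spent on the request.

```python
# Hypothetical service classes (lower number = higher priority).
GOLD, NORMAL, BULK = 0, 1, 2

# Assumed policy: a class is admitted only while the server load
# (as a fraction of capacity) is below its threshold. Gold requests
# are always admitted in this sketch.
ADMIT_BELOW = {GOLD: float("inf"), NORMAL: 0.80, BULK: 0.60}


def admit(priority: int, load: float) -> bool:
    """Decide at arrival time whether a request may proceed."""
    return load < ADMIT_BELOW[priority]


def early_discard(requests, load: float):
    """Filter (priority, name) pairs before any per-request work is done."""
    return [req for req in requests if admit(req[0], load)]


arrivals = [(GOLD, "buy"), (NORMAL, "browse"), (BULK, "video")]
light = early_discard(arrivals, load=0.50)     # all three admitted
moderate = early_discard(arrivals, load=0.70)  # bulk request discarded
heavy = early_discard(arrivals, load=0.90)     # only the gold request admitted
```

A threshold table of this kind is one simple way to express differentiated discard; a real policy could equally delay low-priority requests rather than drop them.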