The Internet has become an increasingly useful tool and means of communication for many people. As the popularity of the Internet has increased, traffic to many Internet service provider (ISP) and application service provider (ASP) sites has become so congested at times that many companies have to impose a limit on the number of users using their sites during peak hours. The result is a significant loss of business for e-business merchants, user dissatisfaction, and the permanent loss of many potential customers. According to at least one source, during the 1999 holiday shopping season, 25 percent of all potential online buyers never completed their online purchases because the e-tail sites of interest had either crashed or were simply too slow. The principal cause of these problems in the case of larger sites was and is an inappropriate distribution of the requests of customers or users (clients) among the sites' resources (servers), namely the multiple content and application servers that are responsible for responding to these requests.
Allocating content and application server resources to respond to a large number of client requests can become rather complex in certain circumstances involving multiple servers at a given site. If it is assumed that there is always at least one server available for each new task that arises, resource assignments may be made in an arbitrary manner, making the resource allocation procedure trivial. To satisfy the assumption underlying this approach to resource allocation, it is generally desirable to create a system design that has abundant resources and strives to conserve them to maintain availability and efficient throughput. In this approach, each client request received at a site is handled as an independent event. U.S. Pat. Nos. 6,173,322, 6,070,191, 5,999,965, and 5,504,894 all describe resource demand distribution schemes that allocate client requests among various resources, with each client request treated as an independent event.
U.S. Pat. No. 6,173,322 is a good example of this approach and describes a system comprised of three host servers each having different request handling capabilities. For illustrative purposes, suppose that hosts H1, H2, and H3 have capabilities C1, C2, and C3 respectively with C3 being the most capable. Further suppose that there are three requests pending, R1, R2, and R3, needing capabilities C1, C2, and C3 respectively. If each request is considered independently and in the order the requests arrive, R1 might be assigned to H3 since this host will serve the request with the least delay. Next, R2 might be assigned to H2 for the same reason. R3 would then suffer if it were assigned to the only remaining host, H1, since H1 is under-powered to handle the request. Alternatively, R3 could wait for H3 to become available. The effect of these kinds of inefficiencies is cumulative; if the same three requests (or their respective equivalents) come in repeatedly and are serviced independently, there will be an ever-diminishing availability of resources until the system saturates and stops responding to new requests. Moreover, Internet demand is not well behaved. Service requests often come in bursts or may back up to form a large backlog for a variety of reasons. As a consequence, it is desirable for the resource allocation procedure to respond in a more sophisticated manner.
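The three-host illustration above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the numeric capability levels, the function names, and the brute-force joint assignment used for contrast are all hypothetical and are not taken from any of the cited patents. Only the first function mirrors the independent, in-order handling described above.

```python
from itertools import permutations

# Assumption: capabilities C1 < C2 < C3 are modeled as the integers 1, 2, 3.
HOSTS = {"H1": 1, "H2": 2, "H3": 3}
# (request name, capability it needs), listed in arrival order.
REQUESTS = [("R1", 1), ("R2", 2), ("R3", 3)]


def assign_independently(requests, hosts):
    """Handle each request as an independent event, in arrival order,
    giving it the most capable host that is still free."""
    free = dict(hosts)
    assignment = {}
    for name, _need in requests:
        best = max(free, key=free.get)  # least delay = most capable free host
        assignment[name] = best
        del free[best]
    return assignment


def underserved(assignment, requests, hosts):
    """Return the requests placed on a host weaker than they need."""
    return [name for name, need in requests if hosts[assignment[name]] < need]


def assign_jointly(requests, hosts):
    """Hypothetical contrast: look at all pending requests together and
    pick the host matching that leaves the fewest requests under-served."""
    names = [name for name, _need in requests]
    best = None
    for perm in permutations(hosts, len(names)):
        candidate = dict(zip(names, perm))
        cost = len(underserved(candidate, requests, hosts))
        if best is None or cost < best[0]:
            best = (cost, candidate)
    return best[1]


independent = assign_independently(REQUESTS, HOSTS)
joint = assign_jointly(REQUESTS, HOSTS)
print(independent, underserved(independent, REQUESTS, HOSTS))
print(joint, underserved(joint, REQUESTS, HOSTS))
```

Under these assumptions the independent pass sends R1 to H3 and R2 to H2, leaving R3 on the under-powered H1, whereas considering the three requests together places each request on a host that matches its needs. The exhaustive search is practical only for a handful of hosts and serves purely to make the contrast visible.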
Another problem with the request distribution processes described in U.S. Pat. Nos. 6,070,191, 5,999,965, and 5,504,894 is that these processes consider only parameters related to available resources and do not consider the attributes of the incoming client requests. U.S. Pat. No. 6,173,322 parses certain data contained in incoming client requests, but only for the purpose of applying a static rule to distribute the requests to one of several server groups. Once this has been done, dynamic resource capability rules are applied to assign the request to a server within the group. These rules may operate in consideration of the static rules previously applied, but only after the static rules are first applied.
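The two-stage structure described above, a static rule that selects a server group followed by a dynamic rule that selects a server within that group, might look roughly like the following. This is a loose sketch and not the scheme of U.S. Pat. No. 6,173,322 itself: the path-prefix rule table, the `active_requests` load metric, and all server and function names are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    active_requests: int  # hypothetical load metric used by the dynamic rule


# Assumption: the static rule is a fixed mapping from URL path prefix to group.
GROUPS = {
    "/images/": [Server("img-1", 4), Server("img-2", 1)],
    "/cgi-bin/": [Server("app-1", 2), Server("app-2", 7)],
}
DEFAULT_GROUP = [Server("web-1", 0)]


def select_group(path):
    """Stage 1 (static): choose a server group from data parsed out of the request."""
    for prefix, group in GROUPS.items():
        if path.startswith(prefix):
            return group
    return DEFAULT_GROUP


def select_server(group):
    """Stage 2 (dynamic): choose the least-loaded server inside the chosen group."""
    return min(group, key=lambda s: s.active_requests)


def dispatch(path):
    """Apply the static rule first, then the dynamic rule, and record the new load."""
    server = select_server(select_group(path))
    server.active_requests += 1
    return server.name
```

Note that in this arrangement the request's attributes influence only the choice of group. Within the group, the dynamic rule looks solely at server state, which is the limitation the paragraph above points out.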
While existing schemes for distributing client requests among multiple servers have begun to address some of the problems that arise, it would be desirable to provide a system for distributing client requests across multiple servers that is more efficient and robust. Specifically, it would be advantageous to provide a system that analyzes the attributes of client requests for expected demand patterns with which resource requirements may be associated. Such a system could compare the resource needs of incoming client requests with the resources available, making the resource allocation scheme more adaptive and dynamic in all aspects of its operation.