Web services are adapted to the Web and are capable of bridging any operating system, hardware platform, or programming language by enabling direct program-to-program communication. Web services provide a layer of abstraction above existing software systems, such as application servers, CORBA (Common Object Request Broker Architecture), J2EE (Java 2 Platform, Enterprise Edition), .NET servers, messaging, and packaged applications. Through web services, applications at various Internet locations can be directly integrated and interconnected as if they were part of a single, large information technology (IT) system.
Currently, the underlying back-end infrastructure for stateful web services is focused on a single deployment model. Namely, a large-scale stateful web service is implemented by a large server hosted within a data center. This system forces the clients to communicate directly with the server in the data center. The system is expensive to create, to maintain, and, even more importantly, to scale up. Furthermore, such a system makes it difficult to deal with the costs and uncertainties associated with increased demand for capacity in the system.
In contrast, it may be beneficial to employ a system with a large number of low-cost commodity servers, instead of a large server. Such a system can scale up to handle more workload by adding multiple low-cost commodity servers, instead of adding capacity or upgrading a single server or fewer existing servers. These multiple low-cost commodity servers, which may be known as pods, can hold a subset (or in some cases, a partition) of the state of the web service. However, distribution of requests within such a system (so that the request is directed to the pod with the state needed to handle the request) would have to be dealt with in a novel way.
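The routing problem described above can be pictured with a minimal sketch. It is only an illustration of state partitioned across pods, not the distribution approach of any particular system; the names `Pod`, `find_pod`, and the sample asset IDs are hypothetical.

```python
class Pod:
    """A hypothetical commodity server holding one partition of the service state."""

    def __init__(self, name, state):
        self.name = name
        self.state = dict(state)

    def handle(self, key):
        # A pod can only serve requests for state it actually holds.
        if key not in self.state:
            raise LookupError(f"{self.name} does not hold state for {key!r}")
        return self.state[key]


# Assumed example partitioning: each pod holds a disjoint subset of the state.
pods = [
    Pod("pod-1", {"asset-1": "server"}),
    Pod("pod-2", {"asset-2": "storage device"}),
]


def find_pod(key, pods):
    """Return the pod whose partition contains key, or None if no pod holds it."""
    for pod in pods:
        if key in pod.state:
            return pod
    return None
```

A request for `asset-2` sent to `pod-1` fails, because that pod does not hold the needed state. The distributor therefore has to know, for each request, which pod holds the relevant partition.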
One prior art request distribution system utilizes content embedded in a request to drive load distribution. This technique has been implemented in a number of products, such as HTTP load balancers, XML routers, etc. The load distribution of these products is driven by rules or policies that are configured into the product, and does not leverage an external directory. However, this prior art distribution system is lacking in the following ways:
(1) Limited by the number of rules or policies that can be supported. This limitation introduces two significant restrictions. The first restriction is that the number of partitioned pods to which the load distributor can distribute requests is limited, because each pod has to be represented by at least one rule or policy. The second restriction is that it limits the choice of attributes that can be used to partition the state of the web service. For example, it would be inappropriate to partition the state of the web service by asset ID (an asset can be a server, a storage device, a software package, etc.) if there are millions of assets. A rule or policy would be required for each asset ID, resulting in millions of rules or policies, which far exceeds the limited number supported.
(2) Rules and policies are not intended to be changed dynamically. For instance, consider the above example where the rules and policies would have to be updated when assets are added, removed, or when pods are changed. The system is not suitable for frequent dynamic updates and/or changes to the rules and policies. In many cases, the prior art interfaces are proprietary and designed for linear insertion or enumeration of rules or policies. Additionally, these rules and policies require lengthy and resource-intensive compilation into another internal form before taking effect.
(3) Individual product instances have to be independently managed. A large-scale system will have multiple instances of these products deployed in multiple sites in order to support a large-scale geographically distributed web service. Typically, each of these instances will have to be configured using proprietary interfaces when changes to the rules and policies are needed. This can be a management burden, especially if the configurations must be kept up-to-date despite frequent changes and failures of some instances.
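The limitations listed above can be made concrete with a small sketch of rule-driven, content-based distribution. It is a simplified stand-in and not the interface of any actual load balancer or XML router; the `<assetId>` element, the `Rule` type, and the functions `build_rules` and `route` are assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One configured rule: if the pattern matches the request, send it to pod."""
    pattern: re.Pattern
    pod: str


def build_rules(asset_to_pod):
    # Partitioning by asset ID requires one rule per asset, so the rule
    # table grows linearly with the number of assets.
    return [
        Rule(re.compile(rf"<assetId>{re.escape(asset)}</assetId>"), pod)
        for asset, pod in asset_to_pod.items()
    ]


def route(request_body, rules):
    # Rules are evaluated in order; the first match decides the target pod.
    for rule in rules:
        if rule.pattern.search(request_body):
            return rule.pod
    return None  # no rule matched


rules = build_rules({"A1": "pod-1", "A2": "pod-2"})
target = route("<req><assetId>A2</assetId></req>", rules)
```

In this sketch, adding or removing an asset means rebuilding the rule list, and every product instance would need its own copy updated separately. Those are the scaling and management burdens described in items (1) through (3).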
Another prior art distribution system may utilize session persistence by way of either addresses or cookies. These techniques consistently direct requests associated with a session to a particular server. Typically, an internal hash or associative table is constructed by dynamic learning through observing traffic. However, there is little external control over how distribution is performed, and there is no coordination across instances. Hence, this form of session persistence is inappropriate for request distribution to pods that each hold a subset (or partition) of the web service's state.
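A minimal sketch of learned session persistence shows why the lack of coordination matters. The class `StickyBalancer`, its round-robin assignment of new sessions, and the server names are hypothetical choices for illustration, not a description of any specific product.

```python
from itertools import cycle


class StickyBalancer:
    """Hypothetical balancer that learns session-to-server bindings from traffic."""

    def __init__(self, servers):
        self._next_server = cycle(servers)  # round-robin choice for new sessions
        self._table = {}                    # associative table built by observation

    def route(self, session_id):
        # The first request of a session picks a server; later requests stick to it.
        if session_id not in self._table:
            self._table[session_id] = next(self._next_server)
        return self._table[session_id]


# Two uncoordinated instances that observe the same sessions in a different order.
site_a = StickyBalancer(["s1", "s2"])
site_b = StickyBalancer(["s1", "s2"])

site_a.route("x")
site_a.route("y")
site_b.route("y")
site_b.route("x")
```

Each instance is consistent on its own, but the two instances bind session `x` to different servers because each table depends only on the traffic that instance happened to see. Nothing ties the binding to where the state for that session actually resides.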