Efficient communication systems are becoming increasingly important as the demand for communication services increases. Communication services can range from the processing of telephone call setup requests, to the routing of Internet Protocol (IP) data packets over networks, to the processing of Hypertext Transfer Protocol (HTTP) requests for websites and/or content. Communication systems generally include servers to process requests for services from clients. Servers can range from telecommunication switches for processing of telephone call setup requests, to network routers for routing of IP data packets, to web and/or content servers for processing HTTP requests, and the like.
Occasionally, service requests may arrive at a server at a rate faster than the server can process them. The rate at which the server processes requests can change due to one or more of the following: variations in the processing demands of different requests, background or administrative activities that run on the server, and/or partial or full failure of software or hardware elements in the server. Communication servers typically implement overload controls to maintain the throughput of service request processing at acceptable levels during these periods of high demand.
Some types of servers can experience prolonged overload due to high rates of incoming service requests and/or partial network failures. Overloads can be caused by the following (either individually or in combination): (1) media-stimulated mass-calling events (e.g., tele-votes, charity appeals, competitions, marketing campaigns, and/or the like); (2) emergencies; (3) network equipment failures; and/or (4) auto-scheduled calling (e.g., snow-day cancellation alerts). In the absence of overload control, such overloads can threaten the stability of a communication network and can cause a severe reduction in successful service completions. Ultimately, a server can fail to provide services because requests are lost, making the services unavailable to clients. Overload problems often compound themselves, placing even more load on the server. Furthermore, during overload, the overall capacity of a server can decrease, since many of its resources are spent rejecting and/or treating load that it cannot actually process. Under severe overload, throughput can drop to a small fraction of the original processing capacity, a condition often called congestion collapse. In addition, overload tends to cause service requests to be delayed and/or lost, which can trigger high rates of client abandons and/or reattempts.
Servers can be equipped with some form of adaptive overload detection and control in order to protect against high and unpredictable levels of demand and to keep response times low enough during processing overload to preclude clients from abandoning their service requests prematurely. Some servers implement internal overload control mechanisms, where an overloaded server can reject new requests to maximize successful completion of admitted sessions. Other servers can implement external overload control mechanisms, where servers can control the rate (e.g., set a restriction on the request rate) at which clients can send additional requests for service by communicating control messages to one or more clients.
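As a rough illustration of the internal mechanism described above, the following Python sketch rejects new requests once a load threshold is reached while continuing to process messages that belong to already admitted sessions. The class name, the session-count threshold, and the "process"/"reject" return values are assumptions made for this example only; an actual server could measure overload in other ways (e.g., queue depth or processor occupancy).

```python
class InternalOverloadControl:
    """Hypothetical admission controller: protects admitted sessions by
    rejecting new requests while the server is considered overloaded."""

    def __init__(self, max_sessions):
        # Assumed overload threshold: number of concurrently admitted sessions.
        self.max_sessions = max_sessions
        self.admitted = set()

    def handle(self, session_id, is_new):
        if not is_new:
            # Message within an existing session: process it only if that
            # session was admitted earlier.
            return "process" if session_id in self.admitted else "reject"
        if len(self.admitted) >= self.max_sessions:
            # Overloaded: reject the new request to preserve capacity
            # for sessions that are already in progress.
            return "reject"
        self.admitted.add(session_id)
        return "process"

    def complete(self, session_id):
        # Release capacity when a session finishes.
        self.admitted.discard(session_id)
```

In this sketch the rejection itself still consumes server resources, which is one reason such controls protect a server only to a limited extent.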
However, server-implemented internal and external mechanisms as described above (also known as “receiver-based” control mechanisms) can only protect servers against overload to a limited extent, and have difficulties preventing congestion collapse. In particular, receiver-based control mechanisms require the server to maintain and update the restrictions for clients based on the server's internal overload level and then distribute these restrictions to clients via an overload feedback mechanism. Restrictions can include, for example, explicit rate messages, overload window size messages (that limit the number of messages that can be in transit towards the server without being confirmed), and/or loss percentage messages (by which clients should reduce the number of requests they would normally forward to a server). All receiver-based schemes require the server to monitor the activity of clients in order to update its distribution list, which can include adding a new client to the distribution list when the new client appears, and removing an existing client when that client stops transmitting for a suitable amount of time. Each of these requirements adds processing burden to the already overloaded server.
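The three restriction types named above can be sketched from the client's side as follows. This is a minimal Python illustration under stated assumptions, not an implementation taken from any particular protocol: the class names, the token-bucket realization of the explicit rate, and the caller-supplied timestamps are all hypothetical.

```python
import random


class RateCap:
    """Explicit rate: forward at most `rate` requests per second.
    Realized here as a token bucket (an assumption of this sketch)."""

    def __init__(self, rate, now=0.0):
        self.rate = rate
        self.tokens = float(rate)  # allow up to one second of burst
        self.last = now

    def admit(self, now):
        # Refill tokens for the elapsed time, capped at one second's worth.
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class OverloadWindow:
    """Overload window: at most `size` unconfirmed requests in transit."""

    def __init__(self, size):
        self.size = size
        self.in_flight = 0

    def admit(self):
        if self.in_flight < self.size:
            self.in_flight += 1
            return True
        return False

    def confirm(self):
        # Called when the server confirms a request, freeing a window slot.
        if self.in_flight > 0:
            self.in_flight -= 1


class LossPercentage:
    """Loss percentage: drop roughly `percent` of the requests that the
    client would normally forward."""

    def __init__(self, percent, rng=None):
        self.percent = percent
        self.rng = rng or random.Random()

    def admit(self):
        return self.rng.random() >= self.percent / 100.0
```

In each case the restriction value (rate, window size, or percentage) is what the server would have to compute and distribute to every client on its distribution list.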
In addition, in explicit rate and overload window schemes, an overloaded server continuously evaluates the amount of load it receives from each upstream neighbor and accordingly assigns a suitable rate cap or overload window size, which should be sent back promptly to the transmitting clients to update their restriction levels. Receiver-based schemes that feed back the loss percentage may not impose similar overhead on an overloaded server, because the same loss percentage can be sent to all the upstream neighbors, thus removing the requirement that the server track the request rate it receives from each client. However, loss percentage schemes may not provide efficient control: upstream clients apply the loss percentage to their own outgoing request rate, which can fluctuate quickly, so the request rate arriving at the overloaded server can also fluctuate quickly. These fluctuations require the overloaded server to be able to quickly adapt the throttle percentage to its current load.
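The difference in server-side bookkeeping between the two approaches, and the sensitivity of a loss percentage to fluctuating demand, can be illustrated with a short Python sketch. The function names, the proportional sharing rule for rate caps, and the numeric values are assumptions chosen for the example.

```python
def assign_rate_caps(per_client_load, capacity):
    """Explicit-rate style: the server must track the load from every
    upstream client and compute a cap for each one. Proportional sharing
    is an assumed policy for this sketch."""
    total = sum(per_client_load.values())
    if total <= capacity:
        return {client: float(load) for client, load in per_client_load.items()}
    return {client: capacity * load / total
            for client, load in per_client_load.items()}


def loss_percentage(total_offered, capacity):
    """Loss-percentage style: one value for all clients, computed from the
    aggregate offered rate only."""
    if total_offered <= capacity:
        return 0.0
    return 100.0 * (1.0 - capacity / total_offered)


def arriving_rate(total_offered, percent):
    """Rate that reaches the server when clients drop `percent` of requests."""
    return total_offered * (1.0 - percent / 100.0)


# Per-client caps need per-client state on the server.
caps = assign_rate_caps({"a": 300, "b": 100}, capacity=200)

# A single percentage matches capacity only at the offered rate it was
# computed for.
p = loss_percentage(400, capacity=200)
at_design_load = arriving_rate(400, p)
after_surge = arriving_rate(800, p)
```

With these example numbers, a percentage computed for an offered rate of 400 requests per second admits exactly the server's capacity of 200, but if the offered rate doubles before the server updates the percentage, twice the capacity arrives at the server.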
Another drawback of receiver-based controls is that they may require changes to the particular protocol stack at the clients and the server(s) in order to implement an overload feedback mechanism. For example, the SIP stack of a server may be required to include a new SIP overload response header or new overload event package. Changes to the protocol stack can slow down the adoption of such controls.