In a horizontally scaled system of the type contemplated herein, an overall software application is realized in a number of peer application instances, each providing full functionality of the application and each representing a portion of an overall application capacity or performance capability. However, existing solutions for managing the application traffic from a pool of clients are based on a number of assumptions that generally do not hold for horizontally scaled systems.
In particular, conventional solutions assume that traffic control for the pool of clients is performed in a single instance, e.g., that the whole system is built on a single hardware server and that all traffic is routed through a single point, at which the traffic can be observed and controlled. However, in horizontally scaled systems, hardware and/or software instances can come and go at arbitrary points in time, e.g., due to failures, upgrades, etc.
Perhaps more critically, the distribution of traffic from a pool of clients to peer application instances within a horizontally scaled application may result in some application instances being over-utilized while others are under-utilized. For example, certain clients, or at least certain connections originating from the same client context, may be “stickier” than others. In this regard, a “sticky” connection is persistent and is associated with continuing application traffic.
It is recognized herein that assigning application traffic incoming from a pool of different clients to respective ones of the peer application instances in a round-robin “load distribution” approach does not account for the fact that sticky connections arising from the distributed application traffic may accumulate at one or more of the application instances. Further, synchronizing the state of traffic control parameters among the peer application instances can be costly in terms of available network bandwidth and the number of messages needed to reach and/or maintain a synchronized state.
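By way of non-limiting illustration, the accumulation of sticky connections under round-robin distribution can be sketched with a small simulation. The function name `round_robin_assign`, the boolean encoding of sticky versus short-lived connections, and the arrival pattern are hypothetical and chosen only to make the effect visible; they do not represent any particular system.

```python
from itertools import cycle


def round_robin_assign(connections, num_instances):
    """Assign each connection to an instance in strict rotation.

    `connections` is a list of booleans (an assumed encoding): True marks a
    sticky (persistent) connection, False a short-lived one. Returns the
    number of sticky connections left open on each instance once the
    short-lived connections have closed.
    """
    sticky_load = [0] * num_instances
    instances = cycle(range(num_instances))
    for is_sticky in connections:
        target = next(instances)  # rotation ignores connection stickiness
        if is_sticky:
            sticky_load[target] += 1
    return sticky_load


# Hypothetical arrival pattern: every third connection is sticky.
arrivals = [i % 3 == 0 for i in range(12)]
load = round_robin_assign(arrivals, num_instances=3)
print(load)  # [4, 0, 0]
```

In this sketch each of the three instances receives the same number of connections (four), yet all four sticky connections land on the first instance, so the continuing traffic is concentrated there even though the assignment itself was even.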