Web application traffic management is a critical feature of enterprise application gateways. A gateway cluster can protect the backend infrastructure from being overwhelmed by an unexpected traffic burst or by a malicious denial-of-service (DoS) attack. With the rise of the API-based programming paradigm, it has become increasingly important to be able to restrict HTTP-based traffic across a cluster of gateways according to a specific policy. A typical policy could include limits on the number of requests per month and/or per second that a given client may make to a backend server. Client requests that violate the policy are buffered (e.g., stored in memory) or rejected. While this technique protects the backend server from being overwhelmed by excess traffic, it can also result in suboptimal performance in various situations. It is therefore desirable to have improvements in network traffic management in a cluster environment.
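
As a rough illustration of the kind of policy described above, the following is a minimal sketch of a single gateway enforcing per-second and per-month request limits for each client. It is not taken from any particular product: the names `Policy`, `ClientState`, `Gateway`, and `handle` are hypothetical, the month is approximated as a fixed 30-day window, and the choice to buffer per-second violations while rejecting per-month violations is an assumption made only for the example. Coordination between multiple gateways in a cluster and draining of the buffer are deliberately left out.

```python
from collections import deque
from dataclasses import dataclass, field

# Assumption: a "month" is approximated as a fixed 30-day window.
SECONDS_PER_MONTH = 30 * 24 * 3600


@dataclass
class Policy:
    """Hypothetical per-client traffic policy."""
    per_second: int       # max requests forwarded within one second
    per_month: int        # max requests forwarded within one 30-day window
    buffer_size: int = 0  # max violating requests held in memory


@dataclass
class ClientState:
    """Fixed-window counters kept for each client."""
    second_window: int = -1
    second_count: int = 0
    month_window: int = -1
    month_count: int = 0
    buffered: deque = field(default_factory=deque)


class Gateway:
    """Single-gateway sketch; cluster-wide coordination is not modeled."""

    def __init__(self, policy: Policy) -> None:
        self.policy = policy
        self.clients: dict[str, ClientState] = {}

    def handle(self, client_id: str, request: str, now: float) -> str:
        """Return 'forward', 'buffer', or 'reject' for one request.

        `now` is a timestamp in seconds, passed in explicitly so the
        behavior is deterministic and easy to test.
        """
        state = self.clients.setdefault(client_id, ClientState())

        # Reset a counter whenever the request falls into a new window.
        second = int(now)
        month = int(now // SECONDS_PER_MONTH)
        if second != state.second_window:
            state.second_window, state.second_count = second, 0
        if month != state.month_window:
            state.month_window, state.month_count = month, 0

        # Assumption: an exhausted monthly quota is rejected outright,
        # since holding the request until the next window is impractical.
        if state.month_count >= self.policy.per_month:
            return "reject"

        # Assumption: a per-second violation is buffered in memory if
        # there is room, and rejected otherwise.
        if state.second_count >= self.policy.per_second:
            if len(state.buffered) < self.policy.buffer_size:
                state.buffered.append(request)
                return "buffer"
            return "reject"

        state.second_count += 1
        state.month_count += 1
        return "forward"
```

A caller would construct `Gateway(Policy(per_second=2, per_month=3, buffer_size=1))` and invoke `handle` once per incoming request. Because each gateway in this sketch keeps only local counters, a client whose requests are spread across several gateways could exceed the intended limit, which hints at why enforcing such a policy across a cluster is harder than on a single node.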