Technical Field
Embodiments disclosed herein relate to methods for improving quality of service in a client-server network. More specifically, embodiments provide a protocol for coordinating throttling across multiple clients that send information to a server.
Description of the Related Art
In client-server architectures, tasks or workloads are partitioned between servers and clients. Server applications provide a function or service to client systems that initiate requests for such services over a computer network. Server applications and client applications exchange messages in a request-response messaging pattern. That is, the client application sends a request, and the server application returns a response, typically including an acknowledgment. To communicate, server and client applications use a communications protocol that establishes a common set of rules so that both the client and server know what to expect.
Generally, a distributed system may include multiple front-end client systems that send data for multiple back-end server systems to process. For example, client applications in a social media service may send messages posted on social media servers (e.g., services that allow users to chat with other users, post status updates, or post brief text messages and metadata) from different sources, in large groups, for server applications to process for data mining purposes. In response, the server applications may store the messages in an inverted index, allowing for full-text searching. That is, a client application sends units of work (in this case, social media data) to a server for the server to process. After the server application receives the request, the server application sends an acknowledgment to the client application.
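By way of illustration only, the inverted-index storage and acknowledgment described above may be sketched as follows. This is a minimal sketch, not the disclosed implementation: the function names (`handle_request`, `build_inverted_index`, `search`), the whitespace tokenization, and the dictionary-shaped acknowledgment are all assumptions introduced for the example.

```python
from collections import defaultdict


def handle_request(index, msg_id, text):
    """Hypothetical server handler: index one unit of work, then acknowledge."""
    for token in text.lower().split():
        # Each token maps to the set of message ids that contain it.
        index[token].add(msg_id)
    # The acknowledgment format is an assumption made for this sketch.
    return {"ack": True, "id": msg_id}


def build_inverted_index(messages):
    """Build an index from a {msg_id: text} mapping by handling each message."""
    index = defaultdict(set)
    for msg_id, text in messages.items():
        handle_request(index, msg_id, text)
    return index


def search(index, query):
    """Return the ids of messages that contain every token in the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

A full-text query then reduces to intersecting the sets of message ids stored for each query term, which is what makes the inverted index suitable for searching large groups of messages.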
However, client applications may overwhelm a server with more requests than the server application can handle. As a result, server system resources (e.g., CPU, I/O, memory) become exhausted, degrading overall system performance. To address this issue, a communication protocol may throttle requests using different approaches. One approach is for client applications to detect when a server is performing sub-optimally and, in response, reduce their sending rate. However, this approach does not ensure fair quality of service across all client systems. For example, one client application may continue sending requests at a normal rate while another client application continues throttling its own requests. Another approach is for the server to detect when the request rate across all clients exceeds a given threshold. When the request rate exceeds the threshold, the server prevents clients from sending requests until the server frees some resources. However, this approach requires the server to expend resources to receive and understand each request even while the server is throttling. Therefore, it is still possible for clients to overwhelm the server with requests.
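The shortcomings of the two approaches above may be illustrated with a small simulation. This is a sketch under stated assumptions rather than any particular protocol: the class names `ThresholdServer` and `SelfThrottlingClient`, the per-tick threshold, the rate-halving rule, and the use of a "throttled" response as the client's signal of sub-optimal server performance are all hypothetical simplifications.

```python
class ThresholdServer:
    """Hypothetical server that rejects requests above a per-tick threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.accepted = 0    # requests processed in the current tick
        self.parse_work = 0  # every request received and parsed, accepted or not

    def start_tick(self):
        self.accepted = 0

    def handle(self, request):
        # The server must receive and parse a request before it can reject it,
        # so throttled requests still consume server resources.
        self.parse_work += 1
        if self.accepted >= self.threshold:
            return "throttled"
        self.accepted += 1
        return "ack"


class SelfThrottlingClient:
    """Hypothetical client that may halve its rate after a throttled response."""

    def __init__(self, rate, cooperative=True):
        self.rate = rate
        self.cooperative = cooperative

    def send_tick(self, server):
        responses = [server.handle({"unit": i}) for i in range(self.rate)]
        # Only a cooperative client reacts; nothing forces the others to.
        if self.cooperative and "throttled" in responses:
            self.rate = max(1, self.rate // 2)
        return responses.count("ack")


def simulate(ticks=5, threshold=10):
    """Run one cooperative and one non-cooperative client against one server."""
    server = ThresholdServer(threshold)
    polite = SelfThrottlingClient(rate=8, cooperative=True)
    greedy = SelfThrottlingClient(rate=8, cooperative=False)
    for _ in range(ticks):
        server.start_tick()
        greedy.send_tick(server)
        polite.send_tick(server)
    return polite.rate, greedy.rate, server.parse_work
```

With the default arguments, the cooperative client ends at a rate of 2 while the other client remains at 8, which reflects the fairness problem of client-side throttling. The server also parses more requests than it accepts, which reflects the residual cost of server-side throttling.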