Modern distributed computing systems comprise components that are combined to achieve efficient scaling of distributed computing resources, such as distributed data storage resources, distributed networking resources, and/or other resources. Such distributed computing systems have evolved in such a way that incremental linear scaling can be accomplished in many dimensions. The resources in a given distributed system are often grouped into resource subsystems, such as clusters, datacenters, or sites. The resource subsystems can be defined by physical and/or logical boundaries. For example, a cluster might comprise a logically bounded set of nodes associated with a certain department of an enterprise, while a datacenter might be associated with a particular physical geographical location. Modern clusters in a distributed computing system might support one hundred or more nodes that in turn support several thousand or more autonomous virtualized entities (VEs) running various workloads. Such VEs might be virtual machines (VMs) in hypervisor-assisted virtualization environments and/or executable containers in operating system virtualization environments.
The resources and/or consumers of the resources in a distributed computing system are often managed by various software services (e.g., web services, application services, etc.) implemented in the system. The software services can perform tasks such as resource monitoring, resource analysis (e.g., performance, state, health, etc.), resource scheduling (e.g., VE and/or workload creation, modification, migration, deletion, etc.), and/or other tasks. The software services are often implemented as web services having respective service application programming interfaces (APIs) to facilitate communication with one another and/or with a centralized client (e.g., application, web application) to carry out the resource management tasks.
For example, a user (e.g., system administrator) might interact with a centralized cluster management application (e.g., client) to monitor and/or schedule resources at the aforementioned cluster having one hundred or more nodes and several thousand or more VEs. In this case, the centralized application might continually communicate with multiple web services across the cluster to carry out the monitoring and/or scheduling.
Unfortunately, such communications between web services often become “chatty” and often include redundant and/or unnecessary re-presentations of message information (e.g., headers, bodies, payloads, etc.). Application developers and web service developers are generally unwilling to rearchitect the web services or their APIs; nevertheless, in many distributed computing systems, the number of messages between certain clients and the web services they access grows ever more voluminous. Such “chatty” web services can increase bandwidth consumption (e.g., from message overhead), introduce response latencies (e.g., from sequential message processing), and/or raise other issues that might degrade system performance and/or the user experience. Some legacy approaches seek to address such issues by merely implementing changes to the web service API to consolidate certain messages. With such legacy approaches, each web service and/or application has to be modified to send, receive, and process such consolidated messages, and such modifications demand a significant level of human and financial resources. Worse, with legacy approaches, there is no guarantee that a revised API will be adopted, thus introducing the possibility of inconsistencies and/or conflicts among various versions of the modified web service APIs. What is needed is a technological solution for addressing the performance impact (e.g., bandwidth consumption, response latencies, etc.) of “chatty” web service interactions without modifying the underlying web services.
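To make the message-overhead concern concrete, the following is a minimal sketch that estimates how many bytes are spent when the same header information is re-presented on every one of many small messages, as compared with a single message carrying all of the bodies. The header names, the token value, the `get_vm_status` operation, and all function names are hypothetical and chosen only for illustration. The byte counts are rough approximations that ignore request lines, transport framing, compression, and other effects; the sketch quantifies the problem only and does not represent any particular solution.

```python
import json

# Hypothetical headers repeated on every request (illustrative values only).
COMMON_HEADERS = {
    "Authorization": "Bearer example-token",
    "Content-Type": "application/json",
    "Accept": "application/json",
    "User-Agent": "cluster-manager/1.0",
}


def header_size(headers):
    """Approximate size in bytes of the header lines ("Name: value\\r\\n")."""
    return sum(len(f"{name}: {value}\r\n".encode()) for name, value in headers.items())


def message_size(headers, body):
    """Approximate size in bytes of one message: header lines plus JSON body."""
    return header_size(headers) + len(json.dumps(body).encode())


def chatty_total(bodies):
    """Total bytes when each body is sent as its own message with full headers."""
    return sum(message_size(COMMON_HEADERS, body) for body in bodies)


def consolidated_total(bodies):
    """Total bytes when all bodies travel in one message as a JSON array."""
    return message_size(COMMON_HEADERS, bodies)


if __name__ == "__main__":
    # One status query per VE, for a hypothetical cluster of 1000 VEs.
    requests = [{"op": "get_vm_status", "vm_id": i} for i in range(1000)]
    chatty = chatty_total(requests)
    consolidated = consolidated_total(requests)
    print(f"separate messages: {chatty} bytes")
    print(f"single message:    {consolidated} bytes")
    print(f"repeated-header overhead: {chatty - consolidated} bytes")
```

In this simplified model, the difference between the two totals grows roughly linearly with the number of messages, since each additional separate message repeats the full header set while the consolidated form adds only a two-byte array separator.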