The background description provided herein is for the purpose of generally presenting the context of the disclosure. Unless otherwise indicated herein, the materials described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Distributed systems are becoming very large and may be served by tens of thousands of individual machines. It is highly impractical to build such a clustered distributed system entirely from homogeneous machines. In a cluster of heterogeneous nodes, one node's capabilities can vary widely from those of other nodes in the same cluster, either because of static hardware configuration or because of dynamic load on the system. Hardware configurations, which are constant for a given machine, can vary in storage capacity, storage efficiency, performance, network bandwidth, network performance, number of CPU cores, CPU capabilities, clock rate, and amount of memory. Most common implementations tend to distribute work and data across the cluster nodes in a randomized fashion. Treating all nodes equally and distributing work and data uniformly can overwhelm some nodes, because those nodes have limited disk capacity, have limited CPU or memory resources, or are currently serving a high workload.