1. Technical Field
The present teaching relates to methods, systems, and programming for work load balancing. Particularly, the present teaching is directed to methods, systems, and programming for work load balancing in a distributed system.
2. Discussion of Technical Background
Distributed computing is a field of computer science that studies distributed systems, which include multiple autonomous computers or parallel virtual machines that communicate through a computer network, such as a computer cluster having multiple nodes. The machines in a distributed system interact with each other in order to achieve a common goal. In distributed computing, a problem is divided into many tasks, each of which is solved by one or more computers, such as the nodes of a computer cluster. Distributed systems and applications may be deployed under various paradigms, including grid computing, utility computing, edge computing, and cloud computing, by which users may access server resources through the Internet using a computer, netbook, tablet, smart phone, or other device.
Most distributed systems serving web applications, such as cloud storage and cloud computing systems, behave as dynamic systems with a significant amount of “noise” superimposed on periodic behavior and sudden variations due to garbage collection, scans, etc. In highly scalable and distributed data systems, balancing work load becomes a significant problem because data and query processing must be distributed over existing physical resources. Data storage and processing must also be redistributed as resource configuration changes due to resource optimization and churn events such as physical resource failures, physical resource commissioning, and decommissioning. Finally, application-specific deployment, changes, and processing might give rise to load imbalances, which need to be corrected.
Some known solutions for work load balancing in a distributed system utilize a calculation-based control method. The calculation-based control method is based on statically assigning work assignments to sequenced physical resources. These known solutions, however, lack global registration mechanisms that can monitor the overall work load distribution across the entire system and dynamically balance the work load based on predefined balancing policies.
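For illustration only, a calculation-based static assignment of the kind described above may be sketched as follows. The sketch assumes a hash-modulo rule over sequentially numbered nodes; the function names `static_assign`, `load_per_node`, and `moved_fraction` are hypothetical and are not taken from any particular known system. It shows that such a rule places work without consulting the observed load of any node, and that a change in the number of nodes changes the computed placement of many work units.

```python
import zlib
from collections import Counter


def static_assign(key: str, num_nodes: int) -> int:
    """Map a work unit to a node index purely by calculation.

    Assumption: nodes are numbered 0..num_nodes-1 and the rule is
    CRC32(key) modulo num_nodes. No load information is consulted.
    """
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def load_per_node(keys, num_nodes):
    """Count how many work units the static rule places on each node."""
    counts = Counter(static_assign(k, num_nodes) for k in keys)
    return [counts.get(i, 0) for i in range(num_nodes)]


def moved_fraction(keys, old_nodes, new_nodes):
    """Fraction of work units whose computed node changes after resizing."""
    moved = sum(
        1
        for k in keys
        if static_assign(k, old_nodes) != static_assign(k, new_nodes)
    )
    return moved / len(keys)


if __name__ == "__main__":
    keys = [f"partition-{i}" for i in range(1000)]
    print("units per node (4 nodes):", load_per_node(keys, 4))
    print("fraction reassigned, 4 -> 5 nodes:", moved_fraction(keys, 4, 5))
```

In this sketch the placement depends only on the key and the node count, so a node that becomes overloaded at run time keeps receiving the same work units until the rule itself is changed.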
Moreover, existing controllers of massively scalable distributed systems are too primitive to act as dynamic governors over a large set of possible operational modes. In other words, the existing controllers perform balancing at the extreme edge. However, it is impossible to know the optimal decision in some cases, such as when new servers are added in generations, new servers are added incrementally, load shifts happen in waves, or load exchanges run in parallel. In fact, the optimal decision may actually reduce the space of possible exchanges of load among serving resources.
Therefore, there is a need to provide a solution for automatically performing dynamic work load balancing in various highly distributed, scalable, and elastic data processing and management systems, which aggregate large sets of physical computers and storage resources.