This invention relates, in general, to distributed data processing, and in particular, to distributed solutions for problems arising from large-scale resource assignment tasks.
Due to emerging technologies in computing and communications, present-day machines are able to collect, store, and handle large-scale data. As a result, problems arising from resource assignment tasks, such as scheduling, load balancing, resource allocation, and pricing, are also growing in size; optimization problems are one example. These problems often become so large that they cannot be stored, let alone solved, on a single compute node.
Therefore, to solve these problems, multiple compute nodes are used. When multiple compute nodes are involved in a solution, however, fault tolerance often becomes an issue, as do communications overhead and other inefficiencies.
Moreover, the problems are often dynamic in nature: the parameters of the problems change over time, e.g., decision variables, constraints, or objective functions of the problem are added or removed. These changes typically require a restart of the computation, leading to inefficiencies.