Large amounts of information, especially related information, may be organized into network structures. A Bayesian network is a common example of such a network structure. The use of Bayesian networks is increasing in bioinformatics, pattern recognition, statistical computing, and other fields. Learning a Bayesian network structure is very computationally intensive: the problem of finding a true “optimal” structure may be NP-complete, so an exact solution may be impractical or impossible to determine. Despite this cost, networks with much larger data sets are being explored, which may increase the computational intensity further, potentially exponentially. Heuristic approaches often focus on improving the performance efficiency of structure learning, for example, by decreasing execution time. Performance efficiency is increasingly important in providing acceptable practical solutions for modern networks.
Parallel learning approaches have been considered that apply the resources of multiple computation machines and/or processing cores to a structure learning algorithm. These approaches attempt to distribute work among multiple resources to reduce the time any one system spends finding a solution. Traditional parallel learning distributes computation tasks in a basic, naïve manner: it typically considers only the number of tasks assigned to each parallel computing resource and fails to consider task complexity.
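The count-only distribution described above can be illustrated with a minimal sketch. The function name `distribute_by_count`, the round-robin policy, and the per-task costs are assumptions chosen for illustration; they are not taken from any particular structure learning system.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def distribute_by_count(tasks: Sequence[T], num_workers: int) -> List[List[T]]:
    """Assign tasks round-robin so each worker gets about the same NUMBER of tasks.

    Task cost is deliberately ignored, which is the naive behavior being illustrated.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    buckets: List[List[T]] = [[] for _ in range(num_workers)]
    for index, task in enumerate(tasks):
        buckets[index % num_workers].append(task)
    return buckets


# Hypothetical per-task costs in arbitrary time units (cheap and expensive tasks alternate).
task_costs = [1, 50, 1, 50, 1, 50]

assignment = distribute_by_count(task_costs, 2)
counts = [len(bucket) for bucket in assignment]  # tasks per worker
loads = [sum(bucket) for bucket in assignment]   # total work per worker

print(counts)
print(loads)
```

With these assumed costs, both workers receive three tasks, yet one worker carries nearly all of the work, so equal task counts do not imply equal load.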
For example, in a neighbor score computation, a master or control node may distribute a neighbor computation to each of two parallel computing resources, or nodes. Each node may check a score cache to determine whether a family score is known for the structure. If the score is known (resulting in a cache hit), the node may simply load the score and use it to compute the neighbor score (the score of the directed acyclic graph (DAG), or structure, of interest). If the score is not known (resulting in a cache miss), the node may be required to compute the family score before computing the neighbor score. If the first node has a cache hit and the second node has a cache miss, the first node will determine its neighbor score in much less time than the second node. Thus, there may be a period in which the node with the cache hit and/or the master or control node sits idle (e.g., not performing useful work) while waiting for the node with the cache miss to complete its calculations before more tasks can be distributed or executed. This effectively serializes what was intended to be parallel execution. Thus, current or traditional parallel approaches to structure learning may fail to provide the desired performance for networks of increasing size and complexity, and may fail to provide the tools for proper load balancing among the parallel nodes and for scalability of the system.
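The idle period in the two-node example can be sketched as follows. This is a simplified model under stated assumptions: the `HIT_COST` and `MISS_COST` values, the cached score, and the helper names are hypothetical, and the master is assumed to wait for every node before distributing more tasks.

```python
from typing import Dict, FrozenSet, List, Tuple

# A family is a child variable together with its set of parent variables.
Family = Tuple[str, FrozenSet[str]]

HIT_COST = 1     # assumed time units to load a cached family score
MISS_COST = 100  # assumed time units to compute a family score from the data


def task_time(family: Family, cache: Dict[Family, float]) -> int:
    """Return the assumed time a node spends on one neighbor computation."""
    return HIT_COST if family in cache else MISS_COST


def round_idle_times(families: List[Family], cache: Dict[Family, float]) -> List[int]:
    """Return how long each node waits when the round ends only after the slowest node."""
    times = [task_time(family, cache) for family in families]
    round_length = max(times)
    return [round_length - t for t in times]


# Hypothetical score cache holding one previously computed family score.
score_cache: Dict[Family, float] = {("C", frozenset({"A"})): -12.5}

# Node 1 is given a cached family (hit); node 2 is given an uncached one (miss).
assigned = [("C", frozenset({"A"})), ("C", frozenset({"A", "B"}))]

idle = round_idle_times(assigned, score_cache)
print(idle)
```

Under these assumed costs, the node with the cache hit is idle for 99 of the 100 time units in the round, while the node with the cache miss is never idle.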