1. Field
The present disclosure relates to graph search. More specifically, this disclosure relates to a method and system for parallel processing of explicitly represented graphs.
2. Related Art
Parallel graph search lies at the intersection of artificial intelligence, high-performance computing, and more recently, big data analytics. Unlike loosely coupled datasets, graphs typically have a much higher degree of interdependency and cross-coupling among vertices in the same graph. This renders the standard approach to parallelization based on MapReduce ineffective for all but the most trivial graphs, e.g., those that resemble trees. MapReduce is a software framework for processing large data sets in parallel across a distributed cluster of processors or stand-alone computers.
Most real-world graphs, such as social networks, transportation maps, and electric power grids, cannot be approximated well enough by trees to make them amenable to MapReduce. This presents a significant risk for “big data,” as most algorithms and software packages in this space are implemented on top of open-source MapReduce platforms such as Hadoop. The lack of distributed yet scalable graph processing systems and methods can significantly impact the perceived value of various big data applications.
Because vertices of a graph interact in complex and usually unpredictable ways, it is desirable to detect and then eliminate duplicate encodings of the same vertex reached along alternative paths in the graph. But doing so may incur prohibitive communication and/or synchronization overhead among multiple computers or threads competing for the same piece of data, especially since MapReduce is not efficiently applicable to graphs. Approaches based on delaying duplicate detection to avoid excessive communication/synchronization overhead typically need to search a much (worst-case exponentially) larger space than those that catch duplicates immediately.
One example is the simple grid-path finding problem. In a 4-connected grid world, every vertex has 4 successors. Thus, the number of states explored without duplicate detection is 4^d, where d is the depth of the search. If the parent of a vertex is never generated as its successor, then 3^d is the size of the search space if the graph is treated as a tree. However, there are only O(d^2) unique vertices in a 2D grid world. Thus, there can be a huge difference, e.g., O(3^d) vs. O(d^2), in the size of the search space, if a graph is approximated by a tree. For real-world graphs, the difference is usually bigger because the branching factor, such as the average number of friends on a social network, is typically larger than 3 or 4.
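The gap described above can be checked by direct enumeration for small depths. The following Python sketch is illustrative only: the function names and the choice of the origin as the start vertex are assumptions made for this example and are not part of the disclosure. It counts the nodes generated at depth d when the grid is searched as a tree, with and without parent pruning, and the distinct vertices reachable within d steps when duplicates are detected.

```python
# Four moves of a 4-connected grid world.
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def count_tree_nodes(depth, prune_parent):
    """Count nodes generated at exactly `depth` with no duplicate detection."""
    frontier = [((0, 0), None)]  # (vertex, parent) pairs; start at the origin
    for _ in range(depth):
        next_frontier = []
        for (x, y), parent in frontier:
            for dx, dy in MOVES:
                succ = (x + dx, y + dy)
                if prune_parent and succ == parent:
                    continue  # never regenerate the parent as a successor
                next_frontier.append((succ, (x, y)))
        frontier = next_frontier
    return len(frontier)


def count_unique_vertices(depth):
    """Count distinct vertices within `depth` steps, using a visited set."""
    seen = {(0, 0)}
    frontier = [(0, 0)]
    for _ in range(depth):
        next_frontier = []
        for x, y in frontier:
            for dx, dy in MOVES:
                succ = (x + dx, y + dy)
                if succ not in seen:  # duplicate detection
                    seen.add(succ)
                    next_frontier.append(succ)
        frontier = next_frontier
    return len(seen)


if __name__ == "__main__":
    d = 5
    print(count_tree_nodes(d, prune_parent=False))  # 4^5 = 1024
    print(count_tree_nodes(d, prune_parent=True))   # 4 * 3^4 = 324
    print(count_unique_vertices(d))                 # 2*5^2 + 2*5 + 1 = 61
```

With parent pruning the start vertex still has 4 successors, so the exact count is 4 * 3^(d-1), which is O(3^d). The number of distinct vertices within d steps is 2d^2 + 2d + 1, which is O(d^2).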
Existing approaches to parallel graph search fall into the following four categories:
1. Approaches designed for implicit graphs. An implicit graph is a graph with vertices and edges determined algorithmically rather than being represented as explicit objects in a computer's memory. Without adaptation, such approaches are not directly applicable to explicitly represented graphs, which cannot be generated on the fly by applying a set of rules, as is done in implicit graph search typically found in planning and scheduling applications. Note that, in contrast to implicit graphs, an explicit graph is one in which every vertex and edge of the graph is represented as an object in one or more data stores or computers' memory.
2. Approaches designed for special-case operations on explicit graphs. They leverage some convenient properties of special graph operations that make the search much easier to parallelize. For example, computing the degree of separation on a social graph amounts to running a breadth-first search starting from the seed vertex. Also, because vertices are explored in breadth-first order, the first time a vertex is discovered, its degree of separation from the seed has to be the same, no matter how many duplicate paths exist between them and how many threads are running in parallel. This makes it unnecessary to perform thread synchronization, because even unsynchronized concurrent writes, which usually result in data corruption, still compute the correct answer in this special case. Such easy-to-parallelize operations are very limited, but when applicable, they can achieve great parallel speedups without sophisticated graph processing system designs. On the other hand, if it cannot be guaranteed that all threads write the same value to the same memory cell, then these approaches are not applicable.
3. Approaches designed for general operations on explicit graphs using frequent synchronization. They use synchronization to avoid data corruption under concurrent writes, and they are applicable to general graph operations. Because duplicates are detected as soon as they are generated, these approaches do not greatly expand the search space. However, since synchronization can be time-consuming, their parallel efficiency can be considerably lower than the special-case approaches in category #2.
4. Approaches designed for general operations on explicit graphs using MapReduce. They don't synchronize as often as approaches in the previous category. Instead, they trade off immediate duplicate detection for synchronization-free mapping of the graph. Duplicates are only detected and eliminated in the subsequent reduce phase(s), which usually require synchronization. However, because delaying duplicate detection can result in (worst-case exponentially) more vertex expansions, as mentioned earlier, these approaches can suffer from low parallel efficiency and/or high memory requirements (to store the duplicate search vertices).
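The distinction between implicit and explicit graphs in category 1, and the degree-of-separation computation in category 2, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `FRIENDS` data, the function names, and the use of a thread pool with one barrier per layer are all hypothetical choices made for this example. In CPython, individual dictionary operations are already atomic, so the sketch illustrates the logic of the unsynchronized writes rather than any parallel speedup.

```python
from concurrent.futures import ThreadPoolExecutor

# Explicit graph: every vertex and edge is stored as data (hypothetical example).
FRIENDS = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann", "dan"],
    "dan": ["bob", "cat", "eve"],
    "eve": ["dan"],
}


def explicit_successors(v):
    """Look up stored edges."""
    return FRIENDS[v]


def grid_successors(v):
    """Implicit graph: successors are computed by a rule, nothing is stored."""
    x, y = v
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]


def degrees_of_separation(seed, successors, max_depth, workers=4):
    """Layer-by-layer breadth-first search with no lock around `depth`."""
    depth = {seed: 0}
    frontier = [seed]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for level in range(1, max_depth + 1):

            def expand(v, level=level):
                found = []
                for s in successors(v):
                    if s not in depth:
                        # Every thread in this layer writes the same value,
                        # so a duplicate write cannot change the answer.
                        depth[s] = level
                        found.append(s)
                return found

            # Consuming the results waits for the whole layer (a barrier).
            results = pool.map(expand, frontier)
            frontier = list({s for found in results for s in found})
            if not frontier:
                break
    return depth


if __name__ == "__main__":
    print(degrees_of_separation("ann", explicit_successors, max_depth=10))
    print(len(degrees_of_separation((0, 0), grid_successors, max_depth=2)))
```

The same search routine accepts either successor function, which shows why adapting an implicit-graph method to an explicit graph mainly concerns how successors are obtained and stored. The correctness of the unsynchronized write depends on every writer in a layer storing the same value; an operation without that property would need the synchronization of category 3.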
In summary, approaches in categories 1 and 2 have limited applicability but possibly higher parallel efficiency, whereas those in categories 3 and 4 are broadly applicable but usually suffer from low speed and/or high memory requirements.
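The trade-off between categories 3 and 4 can be illustrated with a small Python sketch on the 4-connected grid. It is not an implementation of any particular system: the function names are hypothetical, the lock-guarded set stands in for frequent synchronization, and the map and reduce phases are simulated in a single process rather than on a MapReduce platform.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def grid_successors(v):
    x, y = v
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]


def expand_layer_synchronized(frontier, visited, workers=4):
    """Category 3 style: a lock guards the shared visited set, so a
    duplicate is rejected at the moment it is generated."""
    lock = threading.Lock()

    def expand(v):
        fresh = []
        for s in grid_successors(v):
            with lock:  # one synchronization per generated successor
                if s in visited:
                    continue
                visited.add(s)
            fresh.append(s)
        return fresh

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return [s for fresh in pool.map(expand, frontier) for s in fresh]


def expand_layer_map_reduce(frontier, visited):
    """Category 4 style: the map phase emits every successor without
    shared state; duplicates are removed only in the reduce phase."""
    emitted = [s for v in frontier for s in grid_successors(v)]  # map
    unique = set(emitted) - visited                              # reduce
    visited |= unique
    return sorted(unique), len(emitted)


def compare(depth):
    """Return (same vertices found?, records stored by the synchronized
    search, records emitted by the simulated map phases)."""
    visited_a, frontier_a = {(0, 0)}, [(0, 0)]
    visited_b, frontier_b = {(0, 0)}, [(0, 0)]
    stored_a = stored_b = 0
    for _ in range(depth):
        frontier_a = expand_layer_synchronized(frontier_a, visited_a)
        stored_a += len(frontier_a)
        frontier_b, emitted = expand_layer_map_reduce(frontier_b, visited_b)
        stored_b += emitted
    return visited_a == visited_b, stored_a, stored_b


if __name__ == "__main__":
    print(compare(3))  # (True, 24, 52)
```

Both variants find the same vertices, but the simulated map phases store 52 records where the synchronized search stores 24, at the cost of taking the lock once per generated successor. This sketch reduces after every layer, which bounds the overhead; postponing the reduce phase across several layers would let the number of stored duplicates grow toward the tree-sized search space discussed earlier.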