The present application is directed to search algorithms and more particularly to graph searches.
There are a number of different types of algorithms used to perform graph searching, including breadth-first search algorithms, uniform-cost search (e.g., Dijkstra's) algorithms, and best-first search (e.g., A*) algorithms, among others. These and other related graph-search algorithms store generated nodes in memory in order to detect duplicates and prevent node regeneration. The scalability of these graph-search algorithms can be dramatically increased by storing nodes in external memory, such as disk storage. However, because random access to a disk is several orders of magnitude slower than random access to an internal memory (e.g., RAM), external-memory graph search algorithms benefit from duplicate-detection strategies that serialize disk access in a way that minimizes disk input/output (I/O), such as the procedures known as delayed duplicate detection (DDD) and structured duplicate detection (SDD).
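By way of illustration only, in-memory duplicate detection of the kind described above can be sketched as a breadth-first search that records every generated node. The function name `breadth_first_search`, the `successors` callback, and the example graph `GRAPH` are assumptions made for this sketch and do not appear in the application.

```python
from collections import deque

def breadth_first_search(start, goal, successors):
    """Return the number of edges on a shortest path from start to goal, or None."""
    open_list = deque([(start, 0)])  # nodes generated but not yet expanded
    seen = {start}                   # every node generated so far (Open and Closed)
    while open_list:
        node, depth = open_list.popleft()
        if node == goal:
            return depth
        for child in successors(node):
            if child in seen:        # duplicate detected: do not regenerate it
                continue
            seen.add(child)
            open_list.append((child, depth + 1))
    return None

# Hypothetical example graph: a four-node cycle 1-2-4-3-1.
GRAPH = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3]}
shortest = breadth_first_search(1, 4, GRAPH.__getitem__)
```

In this sketch the `seen` set is what must fit in memory; it is this stored set that external-memory approaches move to disk.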
Turning to FIG. 1, illustrated is a graph 10, to which delayed duplicate detection (DDD) 12 is applied. In its original and simplest form, delayed duplicate detection (DDD) expands a set of nodes (e.g., the nodes on a search frontier) 14 without checking for duplicates, and stores the generated nodes (including duplicates) in a disk file (or files) 16. The file of nodes is then sorted 18 and duplicates are removed 20. Thereafter, closed nodes are removed 22. In this case, the closed nodes (i.e., the Closed list) are nodes 1, 2, and 3. In keeping with its use by theoretical computer scientists in analyzing the complexity of external-memory graph search, DDD makes no assumptions about the structure of the search graph (except that it is undirected and unweighted). Although in certain special cases DDD may be applicable to directed and weighted graphs (such as a lattice graph of multiple sequence alignment), this requires the graph to have a particular kind of structure that many graphs do not have.
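The four steps described above (expand without checking, write to a file, sort and remove duplicates, then remove closed nodes) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of FIG. 1: nodes are assumed to be integers, the sort is done in memory for brevity where a large file would call for an external sort, and the graph `EXAMPLE_GRAPH` is hypothetical rather than the graph 10 of the figure.

```python
import os
import tempfile

def ddd_expand_layer(frontier, closed, successors):
    """Expand one layer with delayed duplicate detection; return the next frontier."""
    # Step 1: expand every frontier node without checking for duplicates and
    # append each generated node (duplicates included) to a file on disk.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        path = f.name
        for node in frontier:
            for child in successors(node):
                f.write(f"{child}\n")
    try:
        # Step 2: sort the file of generated nodes (in memory here for brevity).
        with open(path) as f:
            generated = sorted(int(line) for line in f)
    finally:
        os.remove(path)
    # Step 3: remove duplicates, which are adjacent after sorting.
    unique = []
    for node in generated:
        if not unique or unique[-1] != node:
            unique.append(node)
    # Step 4: remove nodes that are already on the Closed list.
    return [node for node in unique if node not in closed]

# Hypothetical undirected, unweighted graph (not the graph of FIG. 1).
EXAMPLE_GRAPH = {1: [2, 3], 2: [1, 4, 5], 3: [1, 4, 6],
                 4: [2, 3], 5: [2], 6: [3]}
next_frontier = ddd_expand_layer([2, 3], {1, 2, 3}, EXAMPLE_GRAPH.__getitem__)
```

Expanding nodes 2 and 3 writes 1, 4, 5, 1, 4, 6 to the file; sorting and merging leaves 1, 4, 5, 6, and removing the closed nodes leaves 4, 5, 6 as the next frontier.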
Recent work has shown that the performance of external-memory graph searching can be significantly improved by exploiting the structure of a graph in order to localize memory references. In particular, the structured duplicate detection (SDD) technique exploits local structure captured in an abstract representation of a state-space graph. For graphs with sufficient local structure, structured duplicate detection (SDD) outperforms delayed duplicate detection (DDD) because it never generates duplicates, even temporarily, and thus has lower overhead and reduced complexity. It has also been shown that similar local structure can be used to improve the performance of delayed duplicate detection (DDD).
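As a rough illustration of how an abstract representation can localize memory references, the sketch below groups nodes into abstract nodes and records which abstract nodes can be reached from each one in a single step. It assumes, following the way SDD is commonly described, that only the stored nodes belonging to those reachable abstract nodes need to be resident in internal memory while the nodes of one abstract node are expanded. The names `build_abstract_graph`, `duplicate_detection_scope`, and `project`, and the 12-node path graph, are hypothetical and chosen only for this example.

```python
from collections import defaultdict

def build_abstract_graph(nodes, successors, project):
    """Map each abstract node to the abstract nodes its members reach in one step."""
    abstract_succ = defaultdict(set)
    for node in nodes:
        for child in successors(node):
            abstract_succ[project(node)].add(project(child))
    return dict(abstract_succ)

def duplicate_detection_scope(abstract_node, abstract_succ):
    """Abstract nodes whose stored nodes can contain duplicates of new successors."""
    return abstract_succ.get(abstract_node, set())

# Hypothetical example: a path graph 0-1-...-11, grouped into blocks of four.
NODES = range(12)

def path_successors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < 12]

def block_of(i):
    return i // 4  # projection from a node to its abstract node

ABSTRACT = build_abstract_graph(NODES, path_successors, block_of)
scope_of_middle = duplicate_detection_scope(1, ABSTRACT)
```

Here the middle block can generate successors in all three blocks, while each end block can only reach itself and the middle block, so expanding an end block never requires the opposite end block to be in memory.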
Graph searching is a central problem-solving technique in many areas of artificial intelligence (AI), including planning, scheduling, modeling, and combinatorial optimization. Because graph-search algorithms are both computation-intensive and memory-intensive, developing techniques for improving the efficiency and scalability of graph search is an active and important topic of research. One category of research questions relates to how to exploit available hardware resources in a graph search. The possibilities include using the previously mentioned external memory, such as a disk, to increase the number of visited nodes that can be stored in order to check for duplicates, as well as using parallel processors, or multiple cores of the same processor, in order to improve search speed.
Parallel graph search is an important research topic in the AI search field, as well as in the high-performance computing community. Most existing approaches make the limiting assumption that the search graph is a tree, which lends itself conveniently to parallelization: the topology of a tree guarantees that there is exactly one path from the root to any node, making it easy to keep only a single copy of each node during the search. However, such a simplifying assumption does not hold for many search problems, for which the most natural and economical representation of the search space is a graph. To search a graph efficiently, different ways of reaching a node must be recognized in order to avoid generating duplicates, which, if not detected, usually slow down the problem-solving process exponentially as the search gets deeper. But in parallel graph search, the traditional method of storing global Open and Closed lists to check for duplicates may incur prohibitive communication and/or synchronization overhead, as efforts must be made to avoid race conditions among multiple processing units. Further, even if the Open and Closed lists can be broken down into smaller pieces and distributed across different processors, significant communication overhead may still occur if, for example, one processor generates nodes that belong to a different processor.
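The exponential cost of undetected duplicates can be seen in a small experiment. The sketch below is an illustrative assumption rather than anything taken from the application: it expands an unbounded grid in which each node (x, y) has the two successors (x + 1, y) and (x, y + 1), and counts how many nodes are generated with and without duplicate detection.

```python
def count_generated(depth, detect_duplicates):
    """Count nodes generated when expanding the grid layer by layer to a given depth."""
    frontier = [(0, 0)]
    generated = 1  # the start node
    for _ in range(depth):
        children = []
        for x, y in frontier:
            children.append((x + 1, y))
            children.append((x, y + 1))
        if detect_duplicates:
            # Keep a single copy of each node reached by more than one path.
            children = list(set(children))
        generated += len(children)
        frontier = children
    return generated

without_detection = count_generated(10, detect_duplicates=False)
with_detection = count_generated(10, detect_duplicates=True)
```

At depth 10, treating the grid as a tree generates 2047 nodes, whereas recognizing duplicates generates only 66, and the gap doubles with each additional layer.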
With regard to parallel search algorithms, it has been pointed out that decreasing the communication coupling between distributed Open lists increases search overhead, and conversely, reducing search overhead using increased communication has the effect of increasing communication overhead. This dilemma is faced by previous approaches to parallel graph search. Although the assumption is often made, for the purpose of parallelization, that a large search problem can be decomposed into a set of smaller ones that are independent from each other, most graph-search problems have sub-problems that interact in complex ways via paths that connect them in a graph. For graphs with many duplicate paths, achieving efficient parallel search remains a challenging and open problem.
Many researchers have recognized that external-memory algorithms and parallel algorithms often exploit similar problem structures to achieve efficiency. This has inspired some recent work on parallelizing graph search using techniques that have proved effective in external-memory graph search, such as delayed duplicate detection (DDD). As mentioned, DDD is an approach to external-memory graph search in which newly generated nodes are not immediately checked against stored nodes for duplicates; instead, they are written to a file that is processed later, in an I/O-efficient way, to remove duplicates. Based on this idea, some recent approaches to reducing communication overhead in parallel graph search delay the communication operations induced by duplicate detection so that they can be combined later into fewer operations and performed more efficiently. But delaying communication between multiple processing units can increase search overhead by creating a large number of duplicates that require temporary storage and eventual processing.
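The trade-off described above can be made concrete with a small sketch that compares sending each generated node immediately against buffering nodes per destination. The static partition `owner` (node modulo processor count), the function names, and the example data are all assumptions made for illustration; they are not taken from any particular prior approach.

```python
from collections import defaultdict

def owner(node, num_procs):
    """Hypothetical static partition: the processor that stores a given node."""
    return node % num_procs

def route_immediately(generated, sender, num_procs):
    """Send one message per generated node that belongs to another processor."""
    return [(owner(n, num_procs), [n])
            for n in generated if owner(n, num_procs) != sender]

def route_batched(generated, sender, num_procs):
    """Buffer nodes per destination and send one combined message to each."""
    buffers = defaultdict(list)
    for n in generated:
        dest = owner(n, num_procs)
        if dest != sender:
            buffers[dest].append(n)  # duplicates are buffered too
    return sorted(buffers.items())

# Nodes generated by processor 0 in a hypothetical four-processor search.
GENERATED = [5, 9, 6, 5, 13, 10, 4]
immediate = route_immediately(GENERATED, sender=0, num_procs=4)
batched = route_batched(GENERATED, sender=0, num_procs=4)
```

In this example, batching reduces six messages to two, but the buffer for processor 1 holds node 5 twice, which illustrates the temporary storage and later processing of duplicates that delayed communication entails.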
Structured duplicate detection (SDD), an alternative approach to external-memory graph search that exploits the structure of a search graph in order to localize memory references, can outperform delayed duplicate detection because it removes duplicates as soon as they are generated, instead of storing them temporarily for later processing, and thus has lower overhead and reduced complexity.
SDD has not been implemented in the area of parallel graph searching. Rather, it appears delayed duplicate detection (DDD) is the primary existing parallelization scheme that attempts to deal with graph structures. However, it has a number of shortcomings. For example, it cannot catch duplicates as soon as they are generated, which leads to less efficient memory usage (due to its storing multiple copies of the same node) and extra overhead when duplicates must be eliminated afterwards.
The present application focuses on improvements in structured duplicate detection (SDD) concepts as they relate to parallel graph searching.