The present application is directed to search algorithms, and more particularly to external-memory graph searches.
Searching a graph of combinatorial possibilities is a central technique in areas such as planning, combinatorial optimization, and model checking. For many problems in these areas, a bottleneck in the graph-searching process is the availability of internal memory of a computer system, such as random access memory (RAM). Therefore, there is currently interest in using external memory, such as disk storage, to improve the scalability of graph-search algorithms. However, access to external memory, such as disk storage, is known to be orders of magnitude slower than access to RAM. To limit the number of slow disk input/output (I/O) operations, two particular approaches have been developed to improve the efficiency of graph-search algorithms: delayed duplicate detection (DDD) and structured duplicate detection (SDD).
A number of different types of algorithms are used to perform graph searching, including breadth-first search algorithms, uniform-cost search (e.g., Dijkstra's) algorithms, and best-first search (e.g., A*) algorithms, among others. These and other related graph-search algorithms store generated nodes in memory in order to be able to detect duplicates and prevent node regeneration. The scalability of these graph-search algorithms can be dramatically increased by storing nodes in external memory, such as disk storage. However, as previously noted, random access to a disk is several orders of magnitude slower than random access to an internal memory (e.g., RAM). External-memory graph-search algorithms therefore benefit from duplicate-detection strategies that serialize disk access in a way that minimizes disk I/O, such as delayed duplicate detection (DDD) and structured duplicate detection (SDD).
Turning to FIG. 1, illustrated is a graph 10, to which delayed duplicate detection (DDD) 12 is applied. In its original and simplest form, delayed duplicate detection (DDD) expands a set of nodes (e.g., the nodes on a search frontier) 14 without checking for duplicates, and stores the generated nodes (including duplicates) in a disk file (or files) 16. The file of nodes is then sorted 18 and duplicates are removed 20. Thereafter, closed nodes are removed 22. In this case, the closed nodes (i.e., the Closed list) are nodes 1, 2, and 3. In keeping with its use by theoretical computer scientists in analyzing the complexity of external-memory graph search, DDD makes no assumptions about the structure of the search graph (except that it is undirected and unweighted). Although in certain special cases DDD may be applicable to directed and weighted graphs (such as the lattice graph of multiple sequence alignment), this requires the graph to have a particular kind of structure that many graphs do not have.
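By way of illustration only, the expand, sort, remove-duplicates, and remove-closed-nodes sequence described above may be sketched as follows. This is a minimal sketch under stated assumptions and not a definitive implementation: the function name ddd_expand_layer, the dictionary EXAMPLE_GRAPH, and the choice of frontier are hypothetical, the example graph is not the graph 10 of FIG. 1, and an in-memory list stands in for the disk file (or files).

```python
def ddd_expand_layer(frontier, closed, successors):
    """One layer of delayed duplicate detection (illustrative sketch only)."""
    # Expand every frontier node WITHOUT checking for duplicates.
    # The list stands in for the disk file that would hold generated nodes.
    generated = []
    for node in frontier:
        generated.extend(successors(node))

    # Sort the file of generated nodes so that duplicates become adjacent.
    generated.sort()

    # Remove duplicates with a single sequential pass over the sorted file.
    unique = []
    for node in generated:
        if not unique or unique[-1] != node:
            unique.append(node)

    # Remove nodes that are already on the Closed list.
    return [node for node in unique if node not in closed]


# Hypothetical undirected, unweighted graph given as adjacency lists.
EXAMPLE_GRAPH = {
    1: [2, 3],
    2: [1, 4, 5],
    3: [1, 5, 6],
    4: [2],
    5: [2, 3],
    6: [3],
}

next_layer = ddd_expand_layer(
    frontier=[2, 3],
    closed={1, 2, 3},
    successors=lambda n: EXAMPLE_GRAPH[n],
)
print(next_layer)  # [4, 5, 6]
```

In this sketch, expanding nodes 2 and 3 generates nodes 1, 4, 5, 1, 5, and 6. Sorting places the two copies of node 1 and the two copies of node 5 next to each other, so that a single sequential pass removes them, after which node 1 is dropped because it is on the Closed list.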
Recent work shows that the performance of external-memory graph searching can be significantly improved by exploiting the structure of a graph in order to localize memory references. In particular, the structured duplicate detection (SDD) technique exploits local structure captured in an abstract representation of a state-space graph. For graphs with sufficient local structure, structured duplicate detection (SDD) outperforms delayed duplicate detection (DDD) because it never generates duplicates, even temporarily, and thus has lower overhead and reduced complexity. It has also been shown that it is possible to use similar local structure in order to improve the performance of delayed duplicate detection (DDD).
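By way of illustration only, one way in which an abstract representation can localize duplicate checks may be sketched as follows. The text above does not set out the mechanics, so everything in this sketch is an assumption: nodes are grouped into buckets by a hypothetical projection function project, and while the nodes of one bucket are expanded, only the buckets of the successor abstract nodes (stored here in the hypothetical mapping ABSTRACT_SUCCESSORS) are held in memory and checked. All names and the example graph are hypothetical, and dictionaries of sets stand in for files on disk.

```python
def sdd_expand(abstract_node, to_expand, buckets, successors, project,
               abstract_successors):
    """Expand nodes of one bucket, checking duplicates immediately (sketch)."""
    # Assumption: only buckets of successor abstract nodes can receive
    # children, so only these buckets need to be resident in memory.
    scope = abstract_successors[abstract_node]
    in_memory = {b: buckets[b] for b in scope}  # stand-in for loading from disk

    new_nodes = []
    for node in to_expand:
        for child in successors(node):
            bucket = project(child)
            # A duplicate is detected at once, so it is never stored.
            if child not in in_memory[bucket]:
                in_memory[bucket].add(child)  # same set object as in buckets
                new_nodes.append(child)
    return new_nodes


# Hypothetical example: states 0..11 on a line, each connected to its neighbors.
def line_successors(n):
    return [m for m in (n - 1, n + 1) if 0 <= m <= 11]

def project(n):
    return n // 4  # abstract nodes 0, 1, 2 each cover four states

ABSTRACT_SUCCESSORS = {0: {0, 1}, 1: {0, 1, 2}, 2: {1, 2}}

buckets = {0: {2, 3}, 1: set(), 2: set()}  # states 2 and 3 already stored
new_nodes = sdd_expand(0, [3], buckets, line_successors, project,
                       ABSTRACT_SUCCESSORS)
print(new_nodes)  # [4]
```

In this sketch, expanding state 3 generates states 2 and 4. State 2 is recognized as a duplicate immediately because its bucket is in memory, so only state 4 is stored, and the bucket for abstract node 2 is never loaded.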
Although approaches that exploit a graph's local structure can be effective, they depend on a graph-search problem having the appropriate kind and a sufficient amount of local structure. Thus a question arises as to whether these approaches will be equally effective for all graphs, or effective at all for some graphs.