1. Field of the Invention
This invention relates to backtracking-based searching, and in particular to transparent backtracking-based searching.
2. Background Art
Backtracking-based searching is a mechanism for solving complex problems having many potential alternative solutions. Rather than analyzing all possible alternatives prior to choosing a solution, a system using backtracking investigates one alternative at a time until an acceptable alternative is found.
In the prior art, special purpose logic-programming computer languages have been created that can be used to write a backtracking search program. Special purpose languages cannot take advantage of the tool sets that are typically available for general purpose programming languages. Further, a developer must become proficient in the special purpose language (e.g., learn its syntax and programming strategies) before writing or developing a software application using the special purpose language.
One example of a special purpose programming language used for backtrack search applications is Prolog. Prolog operates on rules or clauses. Prolog attempts to match a first clause expressed in the rules with another clause that partially matches the first clause, or a literal statement of data. If multiple matches are found, Prolog chooses one of them. If the chosen alternative fails, Prolog backtracks to the point at which the failed match was chosen, and chooses another alternative.
Prolog has a built-in control flow capability that identifies matches, and arranges for backtracking when alternatives fail. Prolog organizes problems into a tree of "choice points", called a "decision tree". A choice point represents a point at which a decision must be made between a set of alternative selections. Each alternative represents a branch of the decision tree. At each choice point, Prolog chooses a branch, and explores that branch to determine whether it leads to an acceptable solution. If the chosen branch does not produce a solution, Prolog backtracks to the choice point and tries a different branch. As part of the backtracking process, before proceeding down a new branch, Prolog must restore the state of execution that existed at the choice point before the previously tried branch. That is, the data state (i.e. values of variables) and the control state (i.e. execution stack) must be restored to what they were before the previous branch was tried.
Another example of a special purpose language that is used to perform backtracking is described in T. A. Budd, "Blending Imperative and Relational Programming," IEEE Software, January 1991, pp. 58-65. The special purpose language, Leda, includes logic programming such as that provided by Prolog and procedural programming derived from Algol-like languages. Leda is a proprietary language that has its own syntax and programming restrictions. One such restriction requires that data types possess some value that indicates an "undefined" state which indicates that a value has not yet been assigned. All variables are undefined before the first use and cannot be used until defined without raising a runtime error.
The relational portion of Leda is fashioned after Prolog. Leda uses rules as in Prolog. The rules can contain calls to imperative functions. However, the imperative functions must return a Boolean value. Choice points are generated at run time based on the rules or a Suspend statement in Leda. A special compiler is needed to compile the Leda proprietary language.
Since Leda is a proprietary language, it does not provide a solution to the problem of implementing a backtracking mechanism using a general purpose language such as C++, for example. It is not possible to use the wide array of tool sets that are available for the general purpose languages.
L. Nigro, "Control Extensions in C++", Journal of Object-Oriented Programming, February 1994, pp. 37-47 describes control extensions to C++ based on threads. Threads are control objects capable of independent execution. A thread is provided with features allowing temporary suspension and later resumption of its execution. A backtracking class is identified as a control extension of the thread class.
The backtracking class makes a copy of the computational state at a point in execution where a decision is made from among a list of alternatives (e.g., a choice point). In the process of choosing an alternative, a choice point is created and the computational state (i.e., data and execution state) is copied to a duplicate thread.
A failure method of the backtracking class is called to roll the execution state of the current thread back to the most recent choice point and return the next integer (representing the next alternative) in the sequence. To roll back the execution state, the duplicate thread is copied to the current thread.
Thus, in Nigro, backtracking is performed by storing a copy of the entire execution stack and data state in the duplicate thread so that they can be restored during backtracking. This results in significant overhead due to the processing needed to copy the duplicate thread into the current thread, and due to the memory resources needed to store the duplicate thread. If backtracking is relatively infrequent, this overhead is wasted. Further, merely to dispose of the failed alternative's execution state, Nigro's approach incurs additional overhead, since it must perform a copy operation to transfer the archived execution state (i.e., the duplicate thread) back to the current thread.
Nigro's approach does not address resource allocation and deallocation. The duplicate thread assumes a certain state of resource allocation. When the duplicate thread is transferred to the current thread, unpredictable situations may occur due to a changed state of resource allocation. For example, memory is allocated to store a local string variable after which the duplicate thread is created. The current thread continues and the memory allocated to store a local string variable is released when the method in which the string variable is used terminates. When the current thread is recreated using the duplicate thread, the resource is no longer allocated. However, the restored thread assumes that the memory is still allocated for the local string variable. Use of the local string variable by the restored thread results in reading from unallocated memory which is likely to lead to an invalid execution state or an abnormal termination.
To implement backtracking in a general purpose language, current techniques encode logic for backtracking explicitly in the program, which restricts the control flow of the search program. The backtracking logic can be encoded in a program using a recursive search engine that recursively calls itself to perform the search. Each call to itself represents a node, or choice point, on the search's decision tree.
The following is a pseudocode example of a recursive search routine. The search routine operates with a data state that includes an ordered list of steps or goals each of which must be satisfied for the search to be successful.
boolean search( )
{
    if there are no goals remaining, return true
    ask the first goal for a set of alternatives
    for each such alternative
    {
        make the side-effects associated with the alternative
        if search( ), return true
        unmake the side-effects associated with the alternative
    }
    return false
}
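As a concrete illustration of the pseudocode above, the following is a minimal C++ sketch of a recursive search routine. The problem (placing n queens on an n-by-n board), the Board type, and the safe helper are assumptions chosen for illustration and do not come from the prior art being described. Each row is a goal, each column is an alternative, and placing a queen is the side-effect that is made and unmade.

```cpp
#include <vector>

// Hypothetical data state: one goal per row ("place a queen in this row").
struct Board {
    int n;                  // board size, which is also the number of goals
    std::vector<int> cols;  // cols[r] is the column of the queen placed in row r
};

// An alternative is acceptable if it conflicts with no previously placed queen.
bool safe(const Board& b, int col) {
    int row = static_cast<int>(b.cols.size());
    for (int r = 0; r < row; ++r) {
        int c = b.cols[r];
        if (c == col || c - col == r - row || c - col == row - r) return false;
    }
    return true;
}

bool search(Board& b) {
    if (static_cast<int>(b.cols.size()) == b.n) return true;  // no goals remaining
    for (int col = 0; col < b.n; ++col) {  // alternatives of the first remaining goal
        if (!safe(b, col)) continue;
        b.cols.push_back(col);             // make the side-effect
        if (search(b)) return true;        // recursive call mirrors the decision tree
        b.cols.pop_back();                 // unmake the side-effect
    }
    return false;
}
```

Note that the call for row r cannot return until the calls for all later rows have returned, which is the restriction discussed in the paragraphs that follow.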
There are many variations of a recursive technique for writing search programs with general purpose languages. However, each recursive search routine imposes a fundamental restriction on the structure of the search program. The problem domain of the search must necessarily be divided into goals each of which has a set of alternatives and a set of side-effects. As discussed below, one example of a side-effect is a subgoal of a complex goal.
One or more of the goals in the problem domain can be a complex goal which requires substeps at which there might be alternatives. To process a complex goal using a recursive search routine, it is necessary to express the complex goal as a set of substeps (or subgoals that represent each of the substeps). Each of the subgoals must post the next step as a separate goal in the ordered list of goals (e.g., the "make the side-effects . . ." step posts the next step as a goal in the ordered list of goals). The recursive search routine that is called to process a subgoal must call itself to process the next subgoal in the ordered list of goals.
For example, a complex goal has subgoals A, B, and C. To process subgoal A of the complex goal, it is necessary for the recursive search routine to post subgoal B and call itself to process subgoal B. Similarly, to process subgoal B, it is necessary for the recursive search routine to post subgoal C and call itself to process subgoal C. As is the case with recursive calls, subgoal A's search routine call cannot complete (or return) until subgoal B's search routine call returns, and subgoal C's search routine call must return before subgoal B's search routine call can return. Thus, the processing of each goal is not complete until the processing of each subsequent goal is completed or returns. A subgoal's alternatives must be exhausted or an alternative identified as a solution for the subgoal before the recursive search routine returns.
The recursive search techniques all share a common trait. The call structure of the overall program must reflect the structure of the decision tree. In the complex goal example, subgoals A, B, and C are choice points in a decision tree that includes a subgoal hierarchy such that A is a parent of B and a grandparent of C. The same hierarchy is reflected in the call sequence, or call structure, that processes subgoals A, B and C. That is, subgoal A's search routine calls subgoal B's search routine which calls subgoal C's search routine.
As discussed above, to backtrack to an alternative, a goal's recursive search routine cannot return until any subsequent goals' alternatives are exhausted or a solution is found for the subsequent goals. This is a direct reflection of the fact that the only mechanism available to a recursive search engine for reverting to a point in the call structure is to return to an active routine. While it is possible to perform isolated, independent sub-searches that return before the entire search is completed, it is impossible to backtrack to an untried alternative involved in such a search once the sub-search has completed. Thus, such a sub-search is only appropriate when no future goal could be affected by the manner in which the sub-goal was met.
A technique is used in embodiments of the invention such that backtracking programs (e.g., backtracking search programs) can be written in general purpose computer languages (e.g., C++ or Java) without imposing control flow limitations on the search program. A data state and a control state are restored during backtracking. For restoring the data state, embodiments of the invention keep track of the changes made to variables and the point in execution at which the changes are made.
When backtracking occurs, the data state can be restored by undoing the changes to the desired point in execution. For restoring the control state, the method of the invention provides a "failure" exception state that is invoked upon failure in the program (e.g., a failure to find a solution in a search program). The failure exception is "caught" by catch points established in the execution stack. The failure exception is passed up the execution stack until a point is reached prior to the failure at which execution can be re-initiated.
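The two restoration mechanisms described above can be sketched in C++ as follows. This is only an illustrative sketch under stated assumptions: the names Failure, Trail, inner, outer, and attempt are hypothetical, and the change log handles only int variables. A failure thrown deep in the call chain unwinds the execution stack to a catch point, which then undoes every change recorded since that point was established.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct Failure {};  // hypothetical failure exception type

// Hypothetical change log: remembers the old value of each assigned variable.
class Trail {
public:
    void assign(int& var, int value) {
        log_.push_back(std::make_pair(&var, var));  // save address and old value
        var = value;
    }
    std::size_t mark() const { return log_.size(); }  // current point in execution
    void undoTo(std::size_t m) {                      // undo changes made after mark m
        while (log_.size() > m) {
            *log_.back().first = log_.back().second;
            log_.pop_back();
        }
    }
private:
    std::vector<std::pair<int*, int>> log_;
};

// Nested work that changes the data state and then fails.
void inner(Trail& t, int& y) {
    t.assign(y, 99);
    throw Failure{};  // passed up the execution stack
}
void outer(Trail& t, int& x, int& y) {
    t.assign(x, 42);
    inner(t, y);
}

// Catch point: returns false if the attempt failed and was rolled back.
bool attempt(Trail& t, int& x, int& y) {
    std::size_t m = t.mark();
    try {
        outer(t, x, y);
        return true;
    } catch (const Failure&) {
        t.undoTo(m);  // restore the data state to the catch point
        return false;
    }
}
```

After a failed attempt, x and y hold the values they had before outer was called, and execution can be re-initiated from the catch point.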
A backtracking search program identifies points in the execution at which a decision is made among one or more alternatives. An embodiment of the invention causes a choicePoint to be created at a decision point and identifies the alternatives associated with the decision point. A decision tree of choicePoints is generated that is traversed to identify a search solution. When a failure occurs in the program, a target choicePoint is identified that contains an available alternative (i.e., an untried alternative).
Data state changes are stored as Modifications. A Modification is an object that contains a method to unmake the change it records to the data state. Modifications are associated with choicePoints such that a choicePoint's Modifications can be undone.
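One way such Modification objects might look in C++ is sketched below. The interface, the Assignment template, and the ChoicePoint members are assumptions made for illustration, not the structure defined by the invention. Each Modification captures the old value when the change is made, and the owning choicePoint undoes its Modifications in reverse order.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical Modification interface: every change knows how to unmake itself.
struct Modification {
    virtual ~Modification() = default;
    virtual void undo() = 0;
};

// A Modification that records an assignment to a variable of type T.
template <typename T>
class Assignment : public Modification {
public:
    Assignment(T& target, T newValue) : target_(target), old_(target) {
        target_ = std::move(newValue);  // make the change
    }
    void undo() override { target_ = old_; }  // unmake the change
private:
    T& target_;
    T old_;
};

// Hypothetical choicePoint that owns the Modifications made while it is current.
struct ChoicePoint {
    std::vector<std::unique_ptr<Modification>> mods;

    template <typename T>
    void assign(T& target, T value) {
        mods.push_back(std::unique_ptr<Modification>(
            new Assignment<T>(target, std::move(value))));
    }
    void undoAll() {  // newest first, so earlier values are restored last
        while (!mods.empty()) {
            mods.back()->undo();
            mods.pop_back();
        }
    }
};
```

Undoing in reverse order matters when the same variable is assigned more than once under a single choicePoint, because the oldest saved value must be the one left in place.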
A search stack comprises a linear representation of the current branch of the tree of choicePoints. An execution stack comprises the function call/return stack which is used to represent points of execution of the program. The execution stack includes catch points that are capable of catching a failure exception thrown by the program. Each catch point contains a correspondence to a choicePoint in the search stack and an index into the Modifications that hold the values of the variables assigned by the program.
When a failure occurs, a target choicePoint is identified and a failure exception is thrown to revert the data and control states to the target choicePoint. Reversion can be followed by re-execution, if the point at which execution is caught occurs prior to the point of execution associated with the target choicePoint.
Beginning with the current choicePoint, the search stack is traversed backwards to identify the target choicePoint. The target choicePoint is, in general, a choicePoint that still has untried choices. Usually, the most recent such choicePoint is designated as the target choicePoint, but it is possible that a choicePoint has been disabled. By disabling a choicePoint, it is possible to control the search and reduce backtracking. If a choicePoint is disabled, it is ignored as a candidate for the target choicePoint.
If a choicePoint is found not to be the target choicePoint, the Modifications associated with the choicePoint are undone. The search stack is traversed backwards to find the current choicePoint's previous choicePoint and the search for a target choicePoint continues.
When the failure exception is caught at a catch point, a determination is made whether the catch point's choicePoint is the target choicePoint or is prior to the target choicePoint. If not, a failure exception is thrown to traverse backwards through the execution stack. When the catch point that is associated with, or occurs prior to, the target choicePoint is found, the target choicePoint's Modifications are undone. If the catch point occurs prior to the catch point associated with the target choicePoint, execution enters a re-execution mode to re-create the execution stack to reach the target choicePoint's execution point. An untried alternative is selected from the target choicePoint and normal execution continues.
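The backward traversal described in the two preceding paragraphs can be sketched as follows. The ChoicePoint fields, the use of std::function as a stand-in for Modifications, and the findTarget name are all hypothetical. The sketch leaves the passed-over choicePoints on the stack and only undoes their Modifications, since the text above does not say at this point when they are discarded.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical choicePoint record on the search stack.
struct ChoicePoint {
    std::size_t alternativeCount = 0;  // number of alternatives posted
    std::size_t nextUntried = 0;       // index of the next untried alternative
    bool disabled = false;             // disabled choicePoints are never targets
    std::vector<std::function<void()>> undoActions;  // stand-in for Modifications

    bool hasUntried() const { return nextUntried < alternativeCount; }
    void undoModifications() {
        while (!undoActions.empty()) {
            undoActions.back()();
            undoActions.pop_back();
        }
    }
};

// Walks the search stack backwards from the current choicePoint.
// Returns the index of the target choicePoint, or -1 if none remains.
int findTarget(std::vector<ChoicePoint>& searchStack) {
    for (int i = static_cast<int>(searchStack.size()) - 1; i >= 0; --i) {
        ChoicePoint& cp = searchStack[i];
        if (!cp.disabled && cp.hasUntried()) return i;  // target found
        cp.undoModifications();  // not the target: undo its Modifications
    }
    return -1;
}
```

A return value of -1 corresponds to the case in which the search terminates because no choicePoint with an untried alternative remains.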
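The decision made at each catch point can be illustrated with the following C++ sketch. It assumes, purely for illustration, that the failure exception carries the search-stack index of the target choicePoint and that each nested call level is a catch point tied to the choicePoint with the same index. The descend function and the trace vector are hypothetical and exist only to make the behavior observable.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical failure exception carrying the index of the target choicePoint.
struct Failure {
    std::size_t target;
};

// Builds `depth` nested catch points and then fails, aiming at `target`.
void descend(std::size_t level, std::size_t depth, std::size_t target,
             std::vector<std::string>& trace) {
    if (level == depth) throw Failure{target};  // failure at the deepest point
    try {
        descend(level + 1, depth, target, trace);
    } catch (const Failure& f) {
        if (level > f.target) {
            // This catch point is later than the target: keep unwinding.
            trace.push_back("rethrow at " + std::to_string(level));
            throw;
        }
        // This catch point is the target, or is prior to it: stop here.
        trace.push_back("handled at " + std::to_string(level));
    }
}
```

In this sketch the handling catch point always coincides with the target. When the nearest surviving catch point lies before the target instead, the re-execution mode described above is needed to rebuild the execution stack up to the target.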
During re-execution mode, the re-creation process is verified. Instead of re-creating a choicePoint, the choicePoint is compared to an existing choicePoint to verify that the alternatives are the same. Further, Modifications that are identified during the re-execution mode are compared to existing Modifications to verify that the Modifications that occur during re-execution are the same as the original Modifications.
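The verification of Modifications during re-execution might be sketched as follows. The Modification fields, the ModificationLog class, and the use of std::logic_error to report a divergence are assumptions for illustration only. In normal mode a recorded change is appended; in re-execution mode it is compared with the stored entry at the same position instead of being stored again.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record of one data state change.
struct Modification {
    std::string variable;
    int oldValue;
    int newValue;
    bool operator==(const Modification& o) const {
        return variable == o.variable && oldValue == o.oldValue &&
               newValue == o.newValue;
    }
};

class ModificationLog {
public:
    void record(const Modification& m) {
        if (cursor_ < log_.size()) {  // re-execution mode: verify, do not store
            if (!(log_[cursor_] == m))
                throw std::logic_error("re-execution diverged from original");
            ++cursor_;
        } else {                      // normal mode: store the new Modification
            log_.push_back(m);
            cursor_ = log_.size();
        }
    }
    // Restart comparison from the entry at index `from`.
    void beginReExecution(std::size_t from = 0) { cursor_ = from; }
    bool reExecuting() const { return cursor_ < log_.size(); }
    std::size_t size() const { return log_.size(); }
private:
    std::vector<Modification> log_;
    std::size_t cursor_ = 0;
};
```

Once the cursor reaches the end of the stored entries, the log behaves as in normal mode again, which parallels leaving re-execution mode on reaching the target choicePoint.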
Embodiments of the invention separate a search program into a model and a search engine. The model is a search procedure that is expressed in a programming language. The model includes functionality for generating decision points and causing a decision tree to be created. When the model generates a choicePoint, the search engine chooses the first untried choice and returns it to the model.
The search engine provides functionality for creating choicePoints and managing the backtracking process. The model makes a request of the engine to post the alternatives in a new choicePoint that represents the decision point and identifies the alternatives. The engine creates the choicePoint and inserts it at the end of the current branch in the decision tree. The engine returns the next alternative to the model.
The model continues processing at the current choicePoint by trying the choice returned by the engine. If that choice leads to a solution, the search is successful and ends. However, if an alternative fails, the engine reverts the data and execution state of the model to the target choicePoint or terminates the search if there is no such choicePoint remaining.
The model indicates the failure of a choice by calling a "fail" function of the engine. This transfers control to the engine. The fail function throws the failure exception causing the current execution state of the model to be aborted and initiating the backtracking process. When a failure is thrown, control transfers to the engine to revert the data and control states to the target choicePoint. If the point at which the engine backtracks is before the target choicePoint, the model enters re-execution mode.
In re-execution mode, execution by the model is verified to ensure that the re-execution parallels the original execution. A request by the model to record data state changes results in a comparison of the new data state changes with the data state changes stored in Modifications. A request by the model to create a choicePoint results in a comparison of the new alternatives and the alternatives of the existing choicePoint. When the execution state reaches the target choicePoint, the engine exits re-execution mode and returns the next alternative. The model continues the search with the next alternative as though it had picked it instead of the failed alternative.
Backtracking is transparent in that the call structure of the program is separate from the decision tree. The model is concerned with generating the decision tree and testing individual alternatives without requiring complex control flow for dealing with failure or testing each alternative in succession. Instead, the model simply calls the fail function when it realizes that the combination of alternatives currently being considered cannot succeed. The engine maintains the ability to restore the call structure that is used by the model to perform the search. The engine includes a mechanism used to loop through the choices available at a choicePoint. When the model calls the fail function, the engine performs the backtracking. The model simply communicates the failure to the engine. During backtracking, the engine returns the control and data states to an appropriate point and supplies the model with the next alternative. The functionality provided by the engine is not incorporated in the model thereby reducing the complexity of the model and allowing the engine's functionality to be reused with other models.
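The separation of model and engine can be illustrated with a deliberately simplified C++ sketch. It assumes a single catch point at the root of the execution stack, so every backtrack re-executes the model from the start, and it relies on re-execution rather than Modifications to rebuild the data state. The names Engine, choose, solve, pickSecond, and findPair are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

struct Failure {};  // hypothetical failure exception

class Engine {
public:
    // The model posts n alternatives; the engine returns the index to try.
    int choose(int n) {
        if (replay_ < stack_.size()) {  // re-execution mode: verify and replay
            if (stack_[replay_].count != n)
                throw std::logic_error("re-execution diverged from original");
            return stack_[replay_++].current;
        }
        if (n <= 0) fail();             // no alternatives at all is a failure
        stack_.push_back({n, 0});       // new choicePoint at the end of the branch
        replay_ = stack_.size();
        return 0;                       // first untried alternative
    }
    [[noreturn]] void fail() { throw Failure{}; }

    // Runs the model until it completes or no untried alternative remains.
    bool solve(const std::function<void(Engine&)>& model) {
        stack_.clear();
        replay_ = 0;
        for (;;) {
            try {
                model(*this);
                return true;
            } catch (const Failure&) {  // the single, root-level catch point
                while (!stack_.empty() &&
                       stack_.back().current + 1 >= stack_.back().count)
                    stack_.pop_back();  // exhausted choicePoints are not targets
                if (stack_.empty()) return false;
                ++stack_.back().current;  // next alternative of the target
                replay_ = 0;              // re-execute from the start
            }
        }
    }
private:
    struct ChoicePoint { int count; int current; };
    std::vector<ChoicePoint> stack_;  // current branch of the decision tree
    std::size_t replay_ = 0;          // next choicePoint to replay
};

// Hypothetical model: the second decision is made in a helper that returns
// before the failure is detected, yet its alternatives can still be revisited.
int pickSecond(Engine& e) { return e.choose(4); }

bool findPair(int& x, int& y) {
    Engine e;
    return e.solve([&](Engine& en) {
        x = en.choose(4);
        y = pickSecond(en);
        if (x + y != 5 || x >= y) en.fail();
    });
}
```

The model contains no loop over alternatives and no failure-handling control flow. It asks for a choice, tests the combination, and calls fail, while the loop over alternatives lives entirely in the engine.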