This invention relates to the field of decision systems, and in particular, a converter that can be used to convert decisions trees into probabilistic models, such as Bayesian networks, wherein the probabilistic model is usable to reproduce the original decision tree.
Decision systems are often based on decision trees. A decision tree is a data structure that consists of a root node, inner nodes, and leaves. Each node, including the root node, is connected to one or more other nodes or leaves. Such connections are called branches. Each node also represents a test whose outcome will determine which branch to follow. Branches are directed, i.e., they may only be followed in one direction. To resemble a tree structure, any node or leaf may only have one incoming branch. However, a node may have as many outgoing branches as desired. Finally, each leaf, having one incoming and no outgoing branch, represents a conclusion.
To arrive at a conclusion in a given tree, one begins at the root node and keeps performing tests until one arrives at a leaf. For example, a decision tree may assist an automobile owner or mechanic in finding the cause for a problem. Assuming the problem is that the automobile does not start, the corresponding decision tree might ask for checking the battery in its root node. If the battery is dead, the root node would include a branch leading to a conclusion leaf stating that the battery needs to be recharged or replaced. If the battery is fine, another branch could lead to a next test, for example, to check the starter, the ignition, or the fuel supply, possibly leading to more tests and eventually to a conclusion.
Decision trees have shortcomings as compared to probabilistic models. They are less accurate, less flexible, harder to maintain, and their size grows exponentially with the amount of tests contained in the decision system.
Bayesian networks, which are one example of probabilistic models, are data structures of nodes which are connected by directed links and whose capture of causal dependencies includes probabilistic models. The data structures resemble networks rather than trees, i.e., nodes may have more than one incoming link.
The nodes of a Bayesian network represent observations and conclusions while the directed links express causal dependencies between the conclusions and observations. Bayesian networks can be used to generate decision procedures by means of an inference algorithm. Particularly, an inference algorithm can recommend a path through the nodes and directed links of the Bayesian network which resembles a path through a decision tree. For each step along the way, the inference algorithm provides a ranked list of next steps based on prior observations. The user is expected to choose the top ranking recommendation, but is not limited to it. A lower ranked recommendation can be followed if the user cannot or does not want to follow the top recommendation for some reason.
For example, a Bayesian network representing the automobile trouble decision system described above could recommend to check the battery first, but also offer the lower ranked tests of checking the starter, the ignition, and the fuel supply. The user might for some reason know that the battery cannot be the problem but remember that the spark plugs are overdue and decide to test them first. Then, depending on the outcome, the system would recommend what to do next. Bayesian networks, as well as probabilistic models in general, do not only offer this increased flexibility but are also easier to modify and some classes of probabilistic models only grow linearly in size with the amount of tests contained in the decision system.
Despite the advantages of probabilistic models, decision trees already exist for many problems and experts are familiar with capturing their decision knowledge in the form of a decision tree. For example, manufacturers of automobiles, trucks, military vehicles, locomotives, aircrafts and satellites use decision trees to express diagnostic procedures. One approach for converting the knowledge captured in a decision flowchart or tree into probabilistic models is disclosed in U.S. patent application Ser. No. 10/695,529 entitled “Apparatus, Method, and Computer Program Product for Converting Decision Flowcharts into Decision Probabilistic Graphs,” the entire content of which is incorporated herein by reference. The produced probabilistic model is usable to generate decision sequences, or paths, of optimal convergence, i.e., requiring a minimal number of tests. However, some decision trees are optimized for the frequency of the occurrence of conclusions. Some decision trees are optimized for cost and effectiveness of the tests involved. In such and other cases, it is desirable that the probabilistic model be usable to generate the paths of the original decision tree. The desirability of such probabilistic models is also not limited to optimizations. In essence, a probabilistic model usable to generate the original paths of the original decision tree preserves the information of the original tree, i.e., knowledge that was captured in the decision tree implicitly or explicitly is not lost in the conversion.
Therefore, there is a need for a converter that converts decision trees into probabilistic models, such as Bayesian networks, while preserving the paths of the original decision tree. The present invention provides such a converter.