1. Technical Field
The invention relates to the organization and viewing of information. More particularly, the invention relates to a methodology for efficiently transforming large or complex decision trees into compact, optimized representations to ease viewing and interaction by a user.
2. Discussion of the Related Art
A decision tree is a structure composed of nodes and links in which the end nodes, also called leaf nodes, represent actions to be taken, the interior nodes represent conditions to be tested on variables, and the branches represent conjunctions of conditions that lead to actions.
A decision tree can represent a series of decisions that divide a population into subsets. The first decision divides the population into two or more segments (i.e., partitions). For each of these segments, a second decision divides the segment into smaller segments. The second decision depends on the outcome of the first decision, the third decision depends on the outcome of the second decision, and so on. In other words, a decision tree can be used to segment a dataset in such a way that each action node of the tree corresponds to the segment of the dataset for which all the conditions along the branch from the root to that action node are satisfied.
A leveled decision tree is one where the variables appearing along each branch are always in the same order. A decision tree is said to be ‘read once’ when no variable appears twice along any branch.
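By way of illustration only, the two properties defined above can be checked mechanically. The following is a minimal sketch under stated assumptions: a leaf is encoded as a string, an interior node as a pair of a variable name and a dictionary of labeled children, and the function names and sample trees are hypothetical. A tree is leveled if some variable order fits every branch; the sketch only tests one candidate order supplied by the caller.

```python
def variable_sequences(tree, prefix=()):
    """Yield the variables tested along each root-to-leaf branch."""
    if isinstance(tree, str):          # a leaf (action) ends the branch
        yield prefix
    else:
        variable, children = tree
        for child in children.values():
            yield from variable_sequences(child, prefix + (variable,))


def is_read_once(tree):
    """True when no variable appears twice along any branch."""
    return all(len(set(seq)) == len(seq) for seq in variable_sequences(tree))


def follows_order(tree, order):
    """True when every branch tests its variables in the given order.

    Variables may be skipped on a branch. A repeated variable is tolerated
    here; use is_read_once to reject repeats.
    """
    position = {variable: i for i, variable in enumerate(order)}
    for seq in variable_sequences(tree):
        ranks = [position[variable] for variable in seq]
        if ranks != sorted(ranks):
            return False
    return True


# Hypothetical sample trees (not taken from FIG. 1).
leveled = ("Job", {
    "Business": ("Income", {"<100K": "A", ">=100K": "B"}),
    "Other": ("Income", {"<100K": "C",
                         ">=100K": ("Assets", {">200K": "A", "<=200K": "B"})}),
})
swapped = ("Job", {
    "Business": ("Income", {"<100K": "A", ">=100K": "B"}),
    "Other": ("Assets", {">200K": ("Income", {"<100K": "A", ">=100K": "B"}),
                         "<=200K": "C"}),
})
repeated = ("Income", {
    "<100K": ("Income", {"<50K": "A", ">=50K": "B"}),
    ">=100K": "C",
})

ORDER = ("Job", "Income", "Assets")
```

In this sketch, `leveled` satisfies both checks, `swapped` is read-once but tests “Assets” before “Income” on one branch and so does not follow `ORDER`, and `repeated` is not read-once.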
Depending on the decision process being modeled, a decision tree can be extremely complex, having many variables, many values for those variables, and many outcomes depending on the various combinations of the variables and their associated values.
A sample leveled decision tree is shown in FIG. 1. The tree determines the type of credit card that a bank gives to its credit card applicants on the basis of their job, income and value of assets. In this example, “Job,” “Income” and “Assets” are the variables on which conditions are defined. For example, if a customer satisfies the condition “Job=Business AND Income<$100K AND Assets>$200K,” the leaf node reached is “Card=Bronze.”
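By way of illustration only, the single branch quoted above can be encoded and evaluated as follows. This is a minimal sketch, not the structure of FIG. 1: the class names `Leaf` and `Node`, the function `classify`, and the placeholder leaf used for every branch not described in the text are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union


@dataclass
class Leaf:
    action: str  # the action to be taken, e.g. "Card=Bronze"


@dataclass
class Node:
    variable: str  # the variable tested at this interior node
    # Each branch is (label, test on the variable's value, child).
    branches: List[Tuple[str, Callable[[object], bool], "Tree"]]


Tree = Union[Leaf, Node]


def classify(tree: Tree, record: dict) -> str:
    """Follow the branch whose conditions the record satisfies."""
    while isinstance(tree, Node):
        value = record[tree.variable]
        for _label, test, child in tree.branches:
            if test(value):
                tree = child
                break
        else:
            raise ValueError(f"no branch of {tree.variable} matches {value!r}")
    return tree.action


# Placeholder for branches the text does not describe (hypothetical).
UNSPECIFIED = Leaf("Card=Unspecified")

assets = Node("Assets", [
    (">$200K", lambda a: a > 200_000, Leaf("Card=Bronze")),  # from the text
    ("<=$200K", lambda a: a <= 200_000, UNSPECIFIED),
])
income = Node("Income", [
    ("<$100K", lambda i: i < 100_000, assets),
    (">=$100K", lambda i: i >= 100_000, UNSPECIFIED),
])
root = Node("Job", [
    ("Business", lambda j: j == "Business", income),
    ("Other", lambda j: j != "Business", UNSPECIFIED),
])

applicant = {"Job": "Business", "Income": 80_000, "Assets": 250_000}
card = classify(root, applicant)  # "Card=Bronze"
```

Each interior node tests one variable, and the path taken from the root is the conjunction of the conditions satisfied along the way.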
Information presented in the form of a decision tree becomes difficult to comprehend and visualize when the tree is large. This invention relates to a method for finding an optimal ordering of the variables of the decision tree, and using that ordering to convert the tree to an optimal Directed Acyclic Graph, i.e., a “DAG,” such that the DAG represents the same information found in the tree but has the fewest nodes achievable under any ordering of the variables.
A DAG is a Directed Graph with no cycles or loops. A Directed Graph is a set of nodes and a set of directed edges, also known as arcs or links, connecting the nodes. The edges have arrows indicating directionality of the edge.
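By way of illustration only, the absence of cycles can be tested with a standard topological sort (Kahn's algorithm). The following is a minimal sketch, assuming the directed graph is given as a dictionary mapping each node to a list of the nodes its outgoing edges point to; the function name `is_dag` is hypothetical.

```python
from collections import deque


def is_dag(edges):
    """Return True when the directed graph has no cycles or loops."""
    # Collect every node, including those that only appear as edge targets.
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)

    # Count incoming edges for each node.
    indegree = {node: 0 for node in nodes}
    for targets in edges.values():
        for target in targets:
            indegree[target] += 1

    # Repeatedly remove nodes that have no remaining incoming edges.
    queue = deque(node for node in nodes if indegree[node] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for target in edges.get(node, ()):
            indegree[target] -= 1
            if indegree[target] == 0:
                queue.append(target)

    # Nodes that could never be removed lie on, or behind, a cycle.
    return removed == len(nodes)


diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}  # two paths share node "d"
cycle = {"a": ["b"], "b": ["a"]}
```

The `diamond` graph is acyclic even though node "d" has two parents, which is what distinguishes a DAG from a tree; the `cycle` graph is not a DAG.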
Tree representations are comprehensible ‘knowledge structures’ when they are small, but become more and more incomprehensible as they grow in size. Comprehensibility of the knowledge structure is a critical issue since, ultimately, humans must work with and maintain such structures.
Trees often become large because identical subsets of conditions recur throughout the tree. This phenomenon is called “sub-tree replication.”
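By way of illustration only, the cost of sub-tree replication can be measured by counting nodes with and without sharing of identical sub-trees. The following is a minimal sketch and not the method of the invention: it keeps the variable order fixed and merely counts each distinct sub-tree once. The tuple encoding, the function names, and the card values in the sample tree are assumptions.

```python
def tree_size(tree):
    """Count nodes when every occurrence of a sub-tree is stored separately."""
    if isinstance(tree, str):          # a leaf (action)
        return 1
    _variable, branches = tree
    return 1 + sum(tree_size(child) for _label, child in branches)


def shared_size(tree, seen=None):
    """Count nodes when identical sub-trees (and leaves) are stored once."""
    seen = set() if seen is None else seen
    if tree in seen:                   # an identical sub-tree was already counted
        return 0
    seen.add(tree)
    if isinstance(tree, str):
        return 1
    _variable, branches = tree
    return 1 + sum(shared_size(child, seen) for _label, child in branches)


# Hypothetical tree in which the same "Income" sub-tree occurs under both jobs.
# Nested tuples are hashable, so identical sub-trees compare equal.
assets = ("Assets", ((">200K", "Gold"), ("<=200K", "Bronze")))
income = ("Income", (("<100K", assets), (">=100K", "Gold")))
tree = ("Job", (("Business", income), ("Other", income)))

replicated = tree_size(tree)    # 11 nodes as a tree
merged = shared_size(tree)      # 5 nodes once duplicates are shared
```

Sharing the replicated sub-trees turns the tree into a graph in which a node may have more than one parent, which is why the compact form is a DAG rather than a tree.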
Others have attempted to resolve the problems associated with sub-tree replication. Ron Kohavi, in his research paper, “Bottom-up Induction of Oblivious Read-Once Decision Graphs,” published in the “European Conference on Machine Learning,” 1994, introduced a new representation for a decision tree, called the “Oblivious read-once Decision Graph,” i.e., the “OODG.” He also described a method to convert a tree representation to the OODG representation. However, Kohavi's method chooses the ordering of the variables in an ad hoc manner, which does not ensure that the resulting representation has the fewest possible nodes.
Brian R. Gaines, in his research paper “Transforming Rules and Trees into Comprehensible Knowledge Structures,” suggests an alternative representation for general decision trees called the “Exception-based Directed Acyclic Graph,” i.e., an “EDAG.” An “exception” is a clause that has been put alongside a condition, such that, if the condition fails to be satisfied, this clause, called an exception, shall be assumed to be the conclusion for the decision tree. Gaines, however, also fails to address the issue of variable ordering.
Steven J. Friedman and Kenneth J. Supowit, in their research paper, “Finding the Optimal Variable Ordering for Binary Decision Diagrams,” published in “IEEE Transactions on Computers,” Vol. 39, No. 5 in May 1990, discuss a method for finding the optimal variable ordering where the variables of the decision tree are restricted to having only two values, namely true or false. The representation thus restricts each node to at most two outgoing branches, one for each truth value of the condition tested. This method is intended only for use in the design of electronic circuits, where the outcome of a “decision” is binary. The method cannot be applied directly to a general decision tree, where the variables cannot be restricted to the binary values of true and false.
Given the limitations of the prior art, there exists a need for a method to create a representation of complex decision trees that is both comprehensible to a human user and computable in a practically feasible amount of time.