1. Technical Field
The present invention relates generally to computer systems, and more specifically to performance monitoring and enhancement tools for use with rule based expert systems.
2. Background Art
Expert systems are computer programs, often run on general purpose computers, which attempt to capture the knowledge of experts in a field. This captured knowledge can then be used by non-experts who, by entering observable data, are able to receive one or more hypotheses as to the cause of abnormal observations or to receive advice in complex decisions. Expert systems typically incorporate data, including facts and relationships, and rules. The databases used by expert systems are often referred to as knowledge bases.
When executing, expert systems use large amounts of CPU resources. Integrating expert system technology into mainstream data processing environments requires significant effort in performance tuning in order to compete on a performance basis with more conventional procedural approaches using third generation programming languages.
The most popular type of expert system is the production system. Users write rules consisting of a left-hand-side (LHS) and a right-hand-side (RHS). When the LHS conditions of a rule are met, that rule is fired and its RHS is executed. The RHS changes the state of a working memory, which contains all of the facts and data used by the expert system. After a rule is fired, the LHS's of the rules must again be matched against the new state of working memory.
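The rule structure described above can be suggested with a minimal sketch in Python. All names, fact shapes, and the use of host-language functions are illustrative assumptions; real production systems such as OPS5 express rules in a declarative rule language rather than as code.

```python
# Minimal sketch of a production rule: an LHS predicate over working
# memory plus an RHS action that changes working memory.
# Fact shapes and names here are hypothetical, for illustration only.

working_memory = [
    {"type": "sensor", "id": 1, "reading": 212},
    {"type": "limit", "max": 100},
]

def lhs(wm):
    """Match: find a sensor whose reading exceeds the configured limit."""
    for fact in wm:
        if fact.get("type") == "sensor":
            for other in wm:
                if other.get("type") == "limit" and fact["reading"] > other["max"]:
                    return fact  # the matching working memory element
    return None

def rhs(wm, fact):
    """Act: record an alarm fact in working memory."""
    wm.append({"type": "alarm", "sensor": fact["id"]})

fact = lhs(working_memory)
if fact is not None:  # the rule "fires" when its LHS conditions are met
    rhs(working_memory, fact)
```

Note that the RHS changes working memory, which is why, as stated above, the LHS's must be re-matched after each firing.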
FIG. 1 shows a high level block diagram of a production system 10. The production system 10 includes a set of rules 12, each having a left-hand-side 14 and a right-hand-side 16. The production system 10 also includes a working memory 18 which contains the facts "known" by the production system 10. A rule interpreter 20, also referred to as an inference engine, matches rule left-hand-sides 14 with working memory 18, and executes right-hand-sides 16.
The rule interpreter 20 operates in an endless loop known as the recognize-act cycle, shown in FIG. 2. The rule interpreter first matches all rule left-hand-sides with working memory 22. More than one rule may match, each against its own set of relevant facts, but only one rule at a time may be fired to handle one of the facts. The rule interpreter 20 selects the rule to be fired, and the fact, using conflict resolution. Conflict resolution algorithms typically select the highest priority rule and the most current fact for firing. Once one of the rules is selected, the corresponding right-hand-side is executed 26, causing a change in working memory. The cycle then repeats, with all of the rule left-hand-sides again being matched against the updated working memory 22.
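The recognize-act cycle just described can be sketched as a loop. The rule and fact representations below are hypothetical; the conflict-resolution strategy (highest priority rule, then most recent fact) follows the description above.

```python
# Sketch of the recognize-act cycle. Rule and fact representations
# are illustrative assumptions, not taken from any real product.

def recognize_act(rules, working_memory, max_cycles=100):
    for _ in range(max_cycles):
        # Match: collect (rule, fact) instantiations - the conflict set.
        conflict_set = [
            (rule, fact)
            for rule in rules
            for fact in working_memory
            if rule["lhs"](fact)
        ]
        if not conflict_set:
            break  # no rule matches; the system halts
        # Conflict resolution: highest priority rule, then most recent fact.
        rule, fact = max(
            conflict_set,
            key=lambda rf: (rf[0]["priority"], working_memory.index(rf[1])),
        )
        rule["rhs"](working_memory, fact)  # Act: the RHS changes working memory
    return working_memory

# Example: a rule that marks each unprocessed fact, firing once per fact.
rules = [{
    "priority": 1,
    "lhs": lambda f: f.get("done") is False,
    "rhs": lambda wm, f: f.update(done=True),
}]
wm = [{"id": 1, "done": False}, {"id": 2, "done": False}]
recognize_act(rules, wm)
```

Because each firing changes working memory, the loop re-runs the match step before selecting the next rule, exactly as in FIG. 2.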
The art of writing efficient rules programs has not yet been fully developed. General guidelines for efficient rule construction can be found in RULE-BASED PROGRAMMING WITH OPS5, by Thomas Cooper and Nancy Wogrin, Morgan Kaufmann Publishers, Inc., San Mateo, Calif., 1988, and PROGRAMMING EXPERT SYSTEMS IN OPS5, by Lee Brownston et al., Addison-Wesley Publishing Company Inc., Reading, Mass., 1985. Other than these two references, there are no sources of expertise available to expert system builders regarding performance tuning of their programs.
The guidelines in the references cited above are rules-of-thumb to be applied by the expert systems programmer based on his experience. These rules-of-thumb rest on a knowledge of how the rule interpreter works. Rule interpreters in available production systems are optimized for efficiency, so that, in general, when a rule RHS is fired, only those LHS's which are directly affected by the changes to working memory are matched on the following match cycle. The rule interpreters limit matching by use of the Rete algorithm, which creates numerous data structures to store the results of matches so that they need not be made again if the relevant working memory elements have not changed. When one or more working memory elements change, the rule interpreter examines the Rete data structure to determine which portions of which rules are affected, and performs a match only on those rules.
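The caching idea behind this limited matching can be suggested with a deliberately simplified sketch. A real Rete network stores partial matches in dedicated memory nodes; the per-rule cache and `watches` sets below are hypothetical stand-ins for that machinery.

```python
# Toy sketch of Rete-style result caching: match results are stored per
# rule, and after a firing only rules whose conditions touch a changed
# fact type are re-matched. This is a simplification of a real Rete net.

match_calls = {"count": 0}  # counts how many times matching is computed

def match_rule(rule, wm):
    match_calls["count"] += 1
    return [f for f in wm if f["type"] in rule["watches"] and rule["test"](f)]

rules = [
    {"name": "r1", "watches": {"sensor"}, "test": lambda f: f["value"] > 10},
    {"name": "r2", "watches": {"config"}, "test": lambda f: f["value"] < 5},
]
wm = [{"type": "sensor", "value": 12}, {"type": "config", "value": 3}]

# Initial full match; results are cached per rule.
cache = {r["name"]: match_rule(r, wm) for r in rules}

# A working memory change touches only "sensor" facts, so only rules
# watching that fact type are re-matched; r2's cached result is reused.
wm.append({"type": "sensor", "value": 20})
changed = {"sensor"}
for r in rules:
    if r["watches"] & changed:
        cache[r["name"]] = match_rule(r, wm)
```

Only three match computations occur in total, rather than four, because the unaffected rule's stored result is reused.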
Because of the nature of the Rete algorithm, small portions of rule left-hand-sides can sometimes cause large inefficiencies in a rules program, due to the interactions among rules and between rules and data. The inefficiency arises when large numbers of relevant working memory elements are screened or compared with each other in various combinations. The screening is done by intraelement tests and the comparison by interelement tests. The test specifications are referred to as patterns. The manner in which rules and working memory are structured can make a dramatic difference in the time needed to perform pattern matching.
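The distinction between intraelement and interelement tests, and the cross-product cost the latter can incur, can be illustrated as follows. The fact shapes are hypothetical.

```python
# Intraelement tests screen a single working memory element in
# isolation; interelement tests compare elements with each other. Weak
# intraelement screening forces the interelement stage to examine a
# large cross-product of candidate combinations.

facts = [{"type": "task", "owner": i % 3, "size": i} for i in range(30)] \
      + [{"type": "worker", "id": i} for i in range(3)]

# Intraelement tests: constant tests applied to one element at a time.
tasks = [f for f in facts if f["type"] == "task" and f["size"] > 25]
workers = [f for f in facts if f["type"] == "worker"]

# Interelement test: compare surviving candidates pairwise.
pairs = [(t, w) for t in tasks for w in workers if t["owner"] == w["id"]]

# With the restrictive size test, the join examines 4 x 3 = 12
# combinations; without it, 30 x 3 = 90 combinations for the same result
# once the size test is applied afterwards.
```

This is why, as stated above, the structure of rules and working memory can change pattern matching time dramatically even when the rules compute the same result.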
The rules-of-thumb used by expert system programmers to improve efficiency of a program are general in nature, and not always easily applied. Typical examples of such guidelines are: avoid conditions that match many working memory elements; avoid large cross-products between conditions; avoid frequent changes to matched conditions; make matching individual condition elements faster; and limit the size of the conflict set. Typical solutions for some of these problems include reordering conditions on the left-hand-side so that more restrictive ones occur first and conditions that match frequently changing working memory elements occur last. Expert system programmers must often make intuitive guesses as to where changes should be made, since adequate tools for monitoring and evaluating the performance of rule based expert systems do not currently exist.
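The reordering guideline mentioned above, placing more restrictive conditions first, can be sketched with a toy join over hypothetical facts. The counts below simply measure how much comparison work each ordering performs.

```python
# Sketch of the "most restrictive condition first" rule-of-thumb:
# joining a rare condition before a common one shrinks the intermediate
# set that later conditions must be compared against.

orders = [{"status": "error" if i % 50 == 0 else "ok", "id": i}
          for i in range(1000)]
audits = [{"order_id": i} for i in range(1000)]

def join(first, second, key_a, key_b):
    """Nested-loop join; returns the matches and comparison count."""
    comparisons, matches = 0, []
    for a in first:
        for b in second:
            comparisons += 1
            if a[key_a] == b[key_b]:
                matches.append((a, b))
    return matches, comparisons

# Restrictive condition first: only the 20 "error" orders enter the join.
errors = [o for o in orders if o["status"] == "error"]
m1, c1 = join(errors, audits, "id", "order_id")

# General condition first: all 1000 orders enter the join, then filter.
m2, c2 = join(orders, audits, "id", "order_id")
m2 = [(o, a) for (o, a) in m2 if o["status"] == "error"]

# c1 is 20 * 1000 = 20,000 comparisons; c2 is 1000 * 1000 = 1,000,000,
# yet both orderings produce the same 20 matches.
```

The fifty-fold difference for identical results is the kind of effect the guidelines aim at, but as the passage notes, finding which conditions to reorder in a real program is largely guesswork without monitoring tools.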
Expert system performance is extremely data sensitive. It is rarely possible to evaluate the efficiency of a rules program simply by examining the rules themselves. When a rule is fired, the number of other rules involved depends on the current state of the system, the amount of data in working memory, and the firing history of previous rules. The work that needs to be done in pattern matching is not easily predictable in advance. Therefore, there is no universal rule for writing efficient rules in expert system applications.
The benefit of rules programming lies in moving most of the data processing into the LHS, which is compact and declarative. In other words, rule LHS's specify properties of the data without specifying the mechanics of evaluation. Writing rule based applications is simpler than procedural language approaches, but non-optimized programs can sometimes be very inefficient. The cost of optimizing, or tuning, a rules program must therefore be balanced against the productivity gain of writing a rules program for a complex application. An effective tuning facility that economizes the tuning effort is essential.
It would therefore be desirable to provide a system for collecting data useful to help pinpoint which rules cause the greatest inefficiency during execution of a rules program. It would also be desirable for such a system to assist a user in analyzing his application performance and pinpointing the causes of inefficiencies.