A. Field of the Invention
This invention relates to parallel processing systems in general and specifically to such systems containing rule-based software.
B. Background Information
Systems in which several data processors are linked together to perform computational steps simultaneously are known as parallel processing systems. Parallel processing systems are used to reduce the time required to perform such computational steps. This is very helpful in artificial intelligence applications which perform numerous computations to simulate human cognitive thought. In order to recreate the real-time features of human thought, those computations must be performed at very high speeds.
The basic computations performed by a rule-based artificial intelligence system are searches among "working memory elements" (WMEs). Each WME is a set of related data, such as the age, sex, height, weight, and address of an individual. The WMEs, however, are not necessarily related to each other, nor are they ordered with respect to each other.
The searches are organized according to various "production rules." A "production rule" comprises a set of "conditions" and a set of "operations." The operations are enabled to be performed when the conditions are satisfied. A condition is satisfied when a WME specified by the condition exists, and when the data of the WME meets various criteria specified by the condition. The criteria may be, for example, mathematical (e.g., equality, less than, greater than) or of some other type. When the conditions of a production rule are satisfied, the processing system may perform the "operations" for that rule. Those operations may include causing an output or modifying, creating, or deleting other WMEs.
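By way of illustration only, the test of a single condition against a set of WMEs may be sketched as follows. The dictionary layout, the function name condition_satisfied, and the sample data are assumptions made for this sketch and are not taken from any particular rule-based system.

```python
import operator

def condition_satisfied(condition, wme):
    """Return True if the WME has the class named by the condition
    and its data passes every criterion of the condition."""
    if wme["class"] != condition["class"]:
        return False
    for attribute, (compare, value) in condition["tests"].items():
        # A missing attribute or a failed comparison means the condition is not met.
        if attribute not in wme or not compare(wme[attribute], value):
            return False
    return True

# Hypothetical condition: a "person" WME whose sex is male and whose age exceeds 17.
adult_male = {
    "class": "person",
    "tests": {"sex": (operator.eq, "male"), "age": (operator.gt, 17)},
}

# Hypothetical working memory elements; they are unordered and mutually unrelated.
wmes = [
    {"class": "person", "sex": "male", "age": 34},
    {"class": "person", "sex": "female", "age": 29},
    {"class": "person", "sex": "male", "age": 12},
]

matching = [w for w in wmes if condition_satisfied(adult_male, w)]
```

In this sketch each criterion is a pair of a comparison function and a value, so that equality, less-than, and greater-than tests are all handled by the same loop.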
"Production rules," also referred to as "productions," are usually stored in a memory along with the WMEs. The parallel processors perform the computational steps specified by the production rules.
Production rules may be "if-then" rules of the type used in connection with OPS5 programs described in OPS5-User Manual, Charles L. Forgy, Technical Report CMU-CS-81-135, Computer Science Department, Carnegie Mellon University, Pittsburgh, Pa., July 1981, which is herein incorporated by reference. When they are stored in memory, each production rule contains (1) an indication that the rule is a production rule, (2) the name of the rule, (3) a "left-hand" or "if" side which specifies the "conditions," and (4) a "right-hand" or "then" side which indicates "operations" to be performed when a rule's conditions have been satisfied.
Often, the conditions specified in a rule are satisfied by more than one set of working memory elements. For example, if the rule had as its conditions (1) a man and (2) a woman, and if the WMEs included data on the residents of an apartment building, then there would likely be several combinations of WMEs, (i.e., pairings of one man and one woman) which would satisfy the conditions of that rule. When this occurs, the operations specified by that rule can be performed according to some priority determined by the WMEs which satisfied the rule. Furthermore, in a case of conflict when all the conditions of several rules have been satisfied, the rule having the highest priority is the one whose operations are subsequently performed.
A "recognize-act cycle" refers to the set of steps wherein (1) conditions of production rules are satisfied by the working memory elements, (2) the satisfied production rule with the highest priority is determined, and (3) the "then" side actions of the rule with highest priority are performed. The recognize-act cycle is typically the basic set of functional steps for a parallel processor system.
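The three steps of the recognize-act cycle may be sketched, purely as an example, in the following single-processor form. The rule layout, the fixed numeric priorities, and the use of one dictionary in place of a set of WMEs are simplifying assumptions of this sketch.

```python
def recognize_act(rules, memory, max_cycles=10):
    """Run recognize-act cycles until no rule is satisfied; return the names of fired rules."""
    fired = []
    for _ in range(max_cycles):
        # (1) Recognize: collect every rule whose conditions are satisfied.
        conflict_set = [rule for rule in rules if rule["condition"](memory)]
        if not conflict_set:
            break
        # (2) Select the satisfied rule with the highest priority.
        chosen = max(conflict_set, key=lambda rule: rule["priority"])
        # (3) Act: perform the "then" side operations of that rule.
        chosen["operation"](memory)
        fired.append(chosen["name"])
    return fired

def increment(memory):
    memory["count"] += 1

def mark_done(memory):
    memory["done"] = True

# Two hypothetical rules; both are satisfied when count is 2 and done is False.
rules = [
    {"name": "increment", "priority": 1,
     "condition": lambda m: m["count"] < 3, "operation": increment},
    {"name": "halt", "priority": 2,
     "condition": lambda m: m["count"] >= 2 and not m["done"], "operation": mark_done},
]
memory = {"count": 0, "done": False}
fired = recognize_act(rules, memory)
```

When both rules are satisfied in the same cycle, only the operations of the higher priority rule are performed, and the other rule is reconsidered in the next cycle.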
An example of a production rule is:
(P PRODNAME
    (C1 attribute1 value1)
    (C2 attribute2 value2 attribute3 value3)
    -->
    (OPER1)).
This rule is exemplary of the production rules used in OPS5 programs. The symbol P indicates that the rule is a production rule (as distinguished from any other rule recognizable by a parallel processing system). The name of the rule is PRODNAME. Left-hand side conditions of the rule are shown with "class" indicators C1 and C2 and specify WMEs.
Attributes, such as attribute1 and attribute2, refer to other data associated with the WMEs which meet the corresponding condition. For example, if a condition was "male," an attribute might be hair color. Value1, value2, and value3 are the values of the corresponding attributes which must be met to satisfy the condition. The attributes and values together thus specify "condition tests," which are the criteria referred to above. The operation, OPER1, is performed when conditions C1 and C2 and the associated condition tests are satisfied. One or more corresponding WMEs may exist for each condition of an indicated class, specified attributes, and associated values.
Those corresponding WMEs are stored in a memory accessible by the parallel processor for performing the computational steps of production rule PRODNAME. These WMEs may have been stored in the memory during initialization, or they may have been created as a result of an operation performed at the conclusion of a previous recognize-act cycle.
The processor(s) performing the computational steps of a rule access(es) WMEs stored in a memory to identify which of those WMEs satisfy production rule conditions. For the production rule listed above, a processor will evaluate each WME available to it at the end of a previous recognize-act cycle to find any WMEs which satisfy "(C1 attribute1 value1)" or "(C2 attribute2 value2 attribute3 value3)." If the processor finds at least one WME corresponding to each of the conditions of the production rule, the rule is satisfied and can be "fired."
The number of WMEs which satisfy any given condition of a rule may vary widely for different production rules. The processor(s) performing the processing or computational steps for a rule typically keep(s) track of the WMEs satisfying each condition of the rule. This is necessary because the determination of whether a WME can satisfy a later condition may depend on which WMEs satisfied an earlier condition. A processor typically has processing software to keep track of the order in which each WME was most recently created or modified. One means of keeping track involves "time tags," which identify each WME according to the relative time of most recent creation or modification. Both the information regarding the different WMEs which satisfy a condition and the time tag for WMEs may be used to determine the priority of rules.
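One possible way of maintaining time tags is sketched below as an illustration. The class name WorkingMemory, its methods, and the use of a simple counter as the source of time tags are assumptions of this sketch.

```python
import itertools

class WorkingMemory:
    """Stores WMEs together with the time tag of their most recent creation or modification."""

    def __init__(self):
        self._clock = itertools.count(1)   # source of increasing time tags
        self.elements = {}                 # WME identifier -> (time tag, data)

    def make(self, wme_id, data):
        """Create a WME and stamp it with the next time tag."""
        self.elements[wme_id] = (next(self._clock), dict(data))

    def modify(self, wme_id, **changes):
        """Modify a WME; the modification gives it a new, later time tag."""
        _, data = self.elements[wme_id]
        data.update(changes)
        self.elements[wme_id] = (next(self._clock), data)

    def time_tag(self, wme_id):
        return self.elements[wme_id][0]

    def most_recent(self, wme_ids):
        """Return the identifier of the most recently created or modified WME."""
        return max(wme_ids, key=self.time_tag)

wm = WorkingMemory()
wm.make("a", {"class": "person", "weight": 75})
wm.make("b", {"class": "person", "weight": 60})
wm.modify("a", weight=80)   # "a" now carries the latest time tag
```

A priority scheme could then prefer the rule instantiation containing the most recent WME, although the exact ordering used is a design choice outside this sketch.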
Production rule PRODNAME, described above, has only two conditions, C1 and C2. In general, however, a rule may have many more than two conditions.
One approach to implement this type of rule processing is the Rete Match Algorithm, described in Charles L. Forgy, "Rete: A Fast Algorithm For The Many Pattern/Many Object Pattern Match Problem," Artificial Intelligence, Vol. 19, September, 1982, which is herein incorporated by reference. According to the Rete Match Algorithm, each WME tested and found to satisfy any given condition of a rule, except the first condition, is stored as an "alpha token" in a portion of memory referred to as "alpha memory." Several alpha tokens may exist which correspond to a single satisfied condition of a single rule. For each WME which satisfies the first condition of a rule, a "beta token" is created and stored.
According to the Rete Match Algorithm, for each beta token satisfying a first condition of a rule, an alpha token is sought to satisfy a second condition of that rule. The production rule can only fire if an alpha token satisfying the second condition exists and if a beta token satisfies the first condition. Without also satisfying the second condition, satisfaction of the first condition will not allow the production rule to fire.
If a beta token for the first condition and an alpha token for the second condition exist, then the production rule is partially satisfied as to those two conditions, and the partial satisfaction of the production rule for those conditions can be preserved in the form of another beta token stored in a beta memory corresponding to the combined first and second conditions. In a similar manner, beta tokens are stored for each group of successive conditions, starting with the first condition, that are satisfied. These beta tokens are propagated to successive beta memories for each successive condition each time an alpha memory provides an alpha token satisfying the next condition of a rule. When a beta token is stored in the last beta memory, indicating that every condition of a production rule has been satisfied, the rule is ready for firing.
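The propagation of beta tokens through successive beta memories may be illustrated by the following simplified sketch. It recomputes all tokens from scratch and omits tests between conditions, whereas the Rete Match Algorithm updates its memories incrementally; the function name match_rule and the sample data are assumptions of the sketch.

```python
def match_rule(conditions, wmes):
    """Return the contents of the last beta memory: one tuple of WMEs
    per way in which every condition of the rule is satisfied."""
    first, rest = conditions[0], conditions[1:]
    # WMEs satisfying the first condition become beta tokens.
    beta_memory = [(w,) for w in wmes if first(w)]
    for condition in rest:
        # WMEs satisfying a later condition are held as alpha tokens.
        alpha_memory = [w for w in wmes if condition(w)]
        # Each beta token is extended by each alpha token and stored in the
        # beta memory for the group of conditions satisfied so far.
        beta_memory = [token + (w,) for token in beta_memory for w in alpha_memory]
    return beta_memory

# Hypothetical residents of an apartment building.
residents = [
    {"name": "Al", "sex": "male"},
    {"name": "Bo", "sex": "male"},
    {"name": "Cy", "sex": "female"},
    {"name": "Di", "sex": "female"},
    {"name": "Ed", "sex": "female"},
]
is_man = lambda w: w["sex"] == "male"
is_woman = lambda w: w["sex"] == "female"

tokens = match_rule([is_man, is_woman], residents)
```

If no WME satisfies a later condition, the corresponding alpha memory is empty, no beta token reaches the last beta memory, and the rule cannot fire.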
Since for any condition several alpha tokens satisfying that condition may exist, several combinations of alpha tokens may result in a rule being ready to fire. The specific working memory elements or alpha tokens satisfying a condition may have a bearing on the priority of a rule. Accordingly, satisfaction of each rule is sought with each beta token and each alpha token in order to maintain the highest possible priority for a rule.
Many computational steps are required to determine whether and in how many ways any one rule is satisfied by the sets of beta and alpha tokens. Typical production systems process several thousand production rules, causing a great number of computational steps to be performed.
To manage this load, attempts have been made to distribute the computational tasks across parallel processors and reduce the time required for processing production rules. One approach is known as the "rule-parallelism" approach. According to this approach, each production rule is assigned to only one of the parallel processors on the system. The rules are divided approximately equally among the processors. For example, if there were 8000 rules and 4 parallel processors, each processor would be assigned a nonoverlapping subset of 2000 rules.
Each processor independently tests all the working memory elements to see which ones satisfy the conditions of the rules in its subset. At the end of each recognize-act cycle, priority among the fully satisfied rules is determined and the operations specified by the highest priority rule are performed.
Despite the relatively equal division of rules among processors with this approach, rule-parallelism does not ensure equality in the number of tasks to be performed by processors. Certain processors may have rules for which many WMEs satisfy the specified conditions, while other processors may be assigned rules having relatively few condition-satisfying WMEs. Accordingly, when the processors assigned rules with few corresponding WMEs have finished, the other processors will still be processing rules. Thus, there may be task inequality and little reduction in processing time. Furthermore, because different rules may have the same conditions, redundant testing of WMEs will result.
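The task inequality that can arise under rule-parallelism may be illustrated numerically as follows. The per-rule costs are invented figures standing in for the number of matching steps each rule requires, and the sketch assumes the number of rules is evenly divisible by the number of processors.

```python
def partition_loads(rule_costs, n_processors):
    """Assign equally sized, nonoverlapping subsets of rules to the processors
    and return the total matching cost carried by each processor."""
    size = len(rule_costs) // n_processors   # assumes an even division
    return [sum(rule_costs[p * size:(p + 1) * size]) for p in range(n_processors)]

# Hypothetical cost per rule: two rules are satisfied by many WMEs, the rest by few.
rule_costs = [100, 90, 5, 5, 3, 2, 1, 1]

loads = partition_loads(rule_costs, 4)

# The cycle cannot end before the busiest processor has finished.
cycle_time = max(loads)
speedup = sum(rule_costs) / cycle_time
```

With these figures each processor holds two rules, yet one processor carries nearly all of the work, so four processors complete the cycle only slightly faster than one processor would.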
Another approach, known as "condition-parallelism," has been implemented to overcome the problems experienced with rule-parallelism. According to condition-parallelism, only one processor tests WMEs for each condition specified in a rule. Since the numbers of WMEs satisfying any two conditions are likely to differ by much less than the numbers of WMEs satisfying any two rules, the task inequality from condition-parallelism is likely to be smaller than that from rule-parallelism.
Condition-parallelism, however, imposes inter-processor communication overhead. Any given processor which tests WMEs to satisfy a condition must communicate each beta token for the satisfied condition to the processor performing the test for the next condition of a rule. Communication overhead results each time a processor must wait for the beta token associated with a previous condition or retrieve a beta token from a communication link.
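The growth of this communication overhead may be estimated with the following sketch. It assumes, as a worst case chosen only for illustration, that each condition of a rule is tested by a different processor and that every beta token combines with every WME satisfying the next condition; the function name is hypothetical.

```python
def count_token_transfers(matches_per_condition):
    """Count the beta tokens sent between processors for one rule when each
    condition is tested by its own processor (worst case, no inter-condition tests)."""
    transfers = 0
    tokens = matches_per_condition[0]        # beta tokens for the first condition
    for matches in matches_per_condition[1:]:
        transfers += tokens                  # each token crosses a link to the next processor
        tokens *= matches                    # each token pairs with every matching WME
    return transfers

# Hypothetical rule with three conditions satisfied by 2, 3, and 4 WMEs respectively.
transfers = count_token_transfers([2, 3, 4])
```

In this example 2 tokens are sent to the second processor and 6 to the third, for 8 transfers in total, and the count grows multiplicatively as conditions are added.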
Another approach is known as "token-parallelism." This approach is an extension of condition-parallelism and equally distributes tests at junctions or "nodes" between beta memories and alpha memories to determine whether an alpha token exists for an existing beta token. Substantially simultaneous processing is also achieved by this approach, but again at the cost of having to provide greater processor communication overhead. In fact, since the computational steps are divided at an even more basic level than with condition-parallelism, the processor communication overhead is even greater for token-parallelism.
It is, therefore, an objective of the present invention to provide a method for operating a parallel processing system which ensures substantially equal distribution of computational tasks across processors and reduces the time required for processing production rules, regardless of any variation in the number of condition-satisfying working memory elements available for satisfying different rules.
It is another objective of the present invention to provide a processing method which does not experience substantial interprocessor synchronization delays and communication overhead.
It is a further objective of the present invention to provide a processing method which minimizes the total amount of data to be communicated between processors during a recognize-act cycle.
It is yet another objective of the present invention to provide a processing method which does not require a large shared memory for use by each of the system processors.
Additional objectives and advantages of the present invention will be set forth in part in the description which follows and in part will be obvious from that description or may be learned by practice of the invention. The objectives and advantages of the invention may be realized and obtained by the methods and apparatus particularly pointed out in the appended claims.