A. Field of the Invention
This invention relates to data processing. In particular, this invention relates to data mining using a unified neural multi-agent approach.
B. Description of the Related Art
Today, Data Mining has become a "hot" field for business intelligence because, until now, databases were essentially oriented to the input side and had very poor output analysis functionality. The variety of mechanisms now available to extract information from existing data has led to different definitions of Data Mining. However, one could offer a synthetic definition of Data Mining as a pattern discovery process from very large databases to provide models for use in decision making. A more precise definition is useful to classify the different "Data Mining" approaches.
Corporations keep accumulating huge amounts of data every day, data which may represent their biggest potential assets. However, to realize this wealth, data must be transformed into usable form for the end-user. Decision Support Systems (DSS), including Executive Information Systems (EIS) and Data Access, have been developed to provide a first level of solutions to achieve these transformations.
Data Access and Presentation Tools provide a logical view of databases, generate complex SQL queries and include report writers that can generate charts or maps with "drill down" or other analysis capabilities. EIS tools, which are typically custom or semi-customized software, are heavily programmed to provide "canned" reports for top-level executives, as well as some very advanced statistical analysis functions that can be customized by MIS departments for Executive Reporting.
Today, EIS/DSS tools present two main limitations for fully transforming "data" into "knowledge": (i) users must know in advance exactly what they are searching for and how to search for it, which represents a comparatively low level of discovery; and (ii) results are "static", which means there are no unified mechanisms to provide explicitly predictive capabilities for new cases and validation processes therefor.
Although Data Mining presents an excellent market opportunity, there are two major impediments which have precluded the development of this market until now: (i) lack of quality data, which has until very recently represented a formidable bottleneck; and (ii) limited functionalities and performance of existing tools. None of the technologies used in current Data Mining products (Neural Networks, Rule Induction, Case-Based Reasoning (CBR) and statistical analysis) can provide the combination of prediction, explanation, performance and ease of use necessary for widespread usage of knowledge discovery tools.
Neural Networks constitute one of the better approaches to build predictive models from a set of examples with possibly imperfect data. However, Neural Networks present at least three major limitations: (i) the high level of expertise needed to choose a suitable architecture for the system (the number of layers, the number of neurons per layer, etc.) and to validate such system; (ii) the inability of neural nets to "explain" their results, i.e., they do not provide explicitly predictive models; and (iii) the inability to add explicit knowledge, whether to force the system to eliminate artifacts or irrelevant relationships or to introduce a priori knowledge. Sometimes, an architecture already established for a specific problem can be used directly to build a predictive model. However, if one changes the problem or even the input data, the architecture can become inadequate. It is then unclear how to define a new architecture. From the knowledge worker's perspective, Neural Networks provide no knowledge discovery since the network provides no "understanding", i.e., the ability to comprehend the significance and reasons therefor; rather, it is a predictive system, but not an explicitly predictive system.
Rule Induction is a pattern discovery technology which does provide explicit results. The knowledge worker can thus understand the discoveries. However, this approach presents a number of limitations as well: (i) it is not adapted to handle complex data because the learning process is time-prohibitive, owing to the large number of combinations between generated rules; and (ii) when the learning is finished, the quantity of generated rules can be very large. The knowledge worker thus has tremendous difficulty in pinpointing relevant discoveries in a sea of rules. When data is imperfect, rules are not particularly well adapted and the predictive model is thus not reliable.
Case-Based Reasoning is a technology based on the indexation of all the cases which are presented to the system, as for example, all the court cases on a particular legal issue. When the predictive model is established, it can perform reasoning by analogy. For each new case the system tries to find an old case which is the "closest" by some measure. There are different ways to calculate the distance between the different cases. By contrast with Neural Networks, which do not retain the full data presented to them, every case is maintained in a CBR system. The indexation techniques have no capability for generalization, and there is thus no effective synthesis for the knowledge worker. One can say, therefore, that the "richness" of discoveries is very low. It is relatively easy to build a predictive model: one merely defines the data fields, as in a desktop database, and the system indexes the different cases presented. CBR is an interesting approach for building a purely predictive model, like Neural Networks, but it is inadequate to perform knowledge discovery. And since the size of CBR models increases with the number of cases presented, it is difficult to handle very large databases owing to the memory requirements. Moreover, the processing of each new case takes, on average, more time than the previous case because all cases are kept in the system, which exacerbates the database size limitation.
Statistical methods are very well known in data analysis. They are reliable and can provide very rich information. However, the user must often be a trained statistician to choose the right methods and then to interpret the results. The user must have the time and resources to perform the computation of the entire database. Handling non-linear problems, which constitute many normal business problems, requires even more advanced skills. Statistical analysis methods favor large populations and tend to discover the more important trends at the expense of the small ones. However, a company might wish to base its competitive advantage on the capability to discover individual particularities, i.e., special niche markets. Finally, statistical analysis tools do not provide an explicitly predictive model. One cannot use a statistical analysis tool to build an automatic prediction system. This lack of explicit predictive capability makes it difficult to validate data analysis results and necessitates the services of an expert statistician where rapid prediction is called for (e.g., credit acceptance).
C. The Neuroagent as Methodology
In AI, three fundamental levels can be distinguished: the knowledge level (knowledge modelling methodology), the symbol level (techniques for knowledge representation such as rules, semantic networks, and frames) and the sub-symbolic level (associative or connectionist technologies). Usually, the neuro-symbolic hybrid systems focus on the integration of the last two levels.
The neuroagent is a neural multi-agent approach based on macro-connectionism and comprises a double integration. The first integration concerns the association and symbol levels, where a neuro-symbolic fully-integrated processing unit provides learning capabilities and distributed inference mechanisms. The second integration concerns the symbol and knowledge levels, using a concept operational modelling (COM) methodology. COM permits the building of generic knowledge models which ensures coherent maintenance of all knowledge models based thereupon.
Knowledge acquisition is often considered to be a bottleneck for the development of expert systems, but two different cases must be distinguished where: (1) the expertise is complex and a modelling phase is necessary to define the conceptual model of the expertise (knowledge level); and (2) the expertise cannot be formalized and learning capabilities are therefore necessary. A prior art graphical knowledge-based system shell called intelliSphere.TM., marketed in the U.S. by DataMind, Inc., Redwood City, Calif., is based on the neuroagent approach, and has allowed the development of solutions to different industrial problems (e.g., image processing, design by optimization and constraints satisfaction, and medical diagnosis) where classical approaches have presented significant limits for modelling or learning.
Usually, the basic functional unit in an artificial neural network corresponds to a single formal neuron. FIG. 1, described below, shows how the structure of the neuroagent is related to the neurobiological macro-connectionist level, i.e., the basic unit at this level is an assembly or network of neurons, not a single neuron.
From the computer science point of view, the common characteristics between the connectionist and the macro-connectionist levels are: (i) an automata network; (ii) each automaton has a transition function and a memory; (iii) a communication medium based on the numeric propagation; (iv) numeric learning capabilities; (v) associative memory properties; and (vi) the triggering of the automaton's behavior based on the numeric stimulation that the automaton has received.
In comparison to the connectionist level, the macro-connectionist level differs in the following respects: (i) the automaton can be complex; (ii) the automaton is functionally autonomous; (iii) the propagation mechanism is more complex; and (iv) the topology of the network presents a semantic organization.
These latter characteristics are implemented in the neuroagent approach. The neuroagent is both an analysis and modelling entity for the knowledge level and an implemented neuro-symbolic processing unit for the development of knowledge-based systems.
1. Neuroagent Structure.
FIG. 1 shows the structure of a neuroagent 100 which consists of three main elements: (i) the Communication and Activation Envelope 110 which ensures a standard communication between neuroagents, and which controls the neuroagent's state; (ii) the Nominal Zone 120 which describes the neuroagent (e.g., name, synonyms); and (iii) the Internal Process 130 which determines the neuroagent's functional specifications.
a. Communication and Activation Envelope.
The Communication and Activation Envelope 110 contains: (i) the Minimal Excitation Zone 112; and (ii) the Contextual Excitation Zone 114. These excitation zones 112, 114 are areas which will receive the connections 130 from other neuroagents. It is clear that the Communication and Activation Envelope 110 can have the following configurations depending on the connections of its excitation zones 112 and 114: (i) no connection on either of the excitation zones 112 and 114; (ii) connections only on the Minimal Excitation Zone 112; (iii) connections only on the Contextual Excitation Zone 114; and (iv) connections on both the Minimal 112 and the Contextual 114 Excitation Zones.
A neuroexpression is a logic expression of neuroagents. In the embodiment described here, only logical AND's ("&") and NOT's ("!") are permitted as logical connectors within a neuroexpression; OR's ("|") have been dispensed with because their logical function can be satisfied simply by having other neuroexpressions connected in parallel. An "atomic" neuroexpression is one that corresponds to a single neuroagent. With reference now to FIG. 2, one can see in neuroagent network 200 the connections established between neuroexpressions E.sub.1 230, E.sub.2 240, E.sub.3 250 and a given output neuroagent Z 210. In this example, the neuroexpressions E.sub.1 230, E.sub.2 240, E.sub.3 250 consist of two neuroagents each: B 232 and C 234, F 242 and D 244, and B 232 and A 220, respectively. One can observe that all the neuroexpressions E.sub.1 230, E.sub.2 240, E.sub.3 250 are connected (through 255) to neuroagent Z's Contextual Excitation Zone 210b, while neuroagent A 220 is connected (through 225) to its Minimal Excitation Zone 210a. Note that there is nothing which prohibits neuroagents B 232 and A 220 being represented more than once in neuroagent network 200. An output neuroagent may even form part of the input to itself.
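By way of illustration only, the evaluation of a neuroexpression built from AND's and NOT's, with OR obtained through parallel neuroexpressions, might be sketched as follows. The names `Literal`, `eval_exp` and `any_true` are hypothetical, and the sketch assumes a simplified two-valued reading in which a neuroagent that is not validated counts as false.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """One term of a neuroexpression: a neuroagent name, optionally negated ("!")."""
    agent: str
    negated: bool = False


def eval_exp(expression, validated):
    """AND together all literals; `validated` is the set of validated neuroagent names.

    Assumption: a neuroagent absent from `validated` is treated as false.
    """
    return all((lit.agent in validated) != lit.negated for lit in expression)


def any_true(expressions, validated):
    """OR is obtained by connecting several neuroexpressions in parallel."""
    return any(eval_exp(e, validated) for e in expressions)


# The three neuroexpressions of FIG. 2, each an AND of two neuroagents.
E1 = [Literal("B"), Literal("C")]
E2 = [Literal("F"), Literal("D")]
E3 = [Literal("B"), Literal("A")]
```

With neuroagents A and B validated, only E.sub.3 evaluates true, which is enough for the parallel combination to be true.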
Referring once more to FIG. 1, the Minimal Excitation Zone 112 of neuroagent 100 is the zone where all connections present must be validated, i.e., they are necessary conditions. However, having a necessary condition does not mean it is sufficient; rather, it is a minimal condition. For example, the concept "WHEEL" is absolutely necessary for the validation of the concept "CAR", but it is not sufficient. However, it is possible to express a minimal and sufficient condition. Recall that in FIG. 2 neuroagent A 220 is connected to both the Minimal 210a and Contextual 210b Excitation Zones.
The Contextual Excitation Zone 114 of neuroagent 100 is the zone where one defines what "evokes" and what "rejects" the neuroagent 100, or equivalently, what validates or inhibits the neuroagent. Using connection weights, as for example here, w.sub.ij 132, w.sub.ik 134, w.sub.im 136, it is possible to grade evocations and rejections. The evocation is a positive real number that can express a relatively weak (e.g., 20 to 40) or strong (e.g., 80 to 100) evocation. The rejection is a negative real number that can express a relatively weak (e.g., -40 to -20) or strong (e.g., -100 to -80) rejection. The connection weights, w.sub.ij 132, w.sub.ik 134, w.sub.im 136, are, by convention, expressed as percentages.
There are two ways to establish the connection weights: (i) explicitly (e.g., with a fixed number of broad evocation levels: very weakly positive 20, weakly positive 40, positive 60, strongly positive 80, very strongly positive 100; similarly, for rejection: very weakly negative -20, weakly negative -40, negative -60, strongly negative -80, very strongly negative -100); or (ii) by learning from examples.
Contrary to ordinary Neural Networks, the connection weights can be dynamically modulated according to the current context, during the propagation process. This can be accomplished by modulating a connection weight by means of the Modulation Coefficient or by the Stimulation Function, discussed below.
The introduction of Modulation Coefficients and Stimulation Functions means that the defined connection weight, the effective connection weight and the stimulation weight may all be different. This notion of Modulation Coefficient is important because, as shall be seen, it allows external numerical functions to be directly integrated inside the system. The Modulation Coefficient may be a numerical function (fuzzy function, statistical function, or any numerical function). By default, the stimulation function is linear, but can be sigmoidal, for example. A sigmoidal choice, for example, tends to force evocation or rejection results.
b. The Nominal Zone.
The Nominal Zone 120 contains the neuroagent's 100: (i) label, the neuroagent's main identification used to designate it; and (ii) synonyms, used to assign other names to a neuroagent.
c. Internal Process.
There are two types of actions that can be taken by the Internal Process 310 of a neuroagent 300, as shown in FIG. 3: Cognitive actions 330, 340 and Productive actions 320.
Cognitive actions 330, 340 have a direct effect on the inference strategies. For example, a function 330 can be used to validate, inhibit or activate other neuroagents as well as to dynamically modulate the connection weights or functions used to express fuzzy predicates. Also, an Internal Process 310 can be used to embed a network of neuroagents 340 within a neuroagent 300.
Productive actions 320 are programmed functions (in "C" for example) which correspond to external processing (e.g., operate a video camera, execute an SQL request, make a library function call) or even the encapsulation of another entire neuroagent-based application.
It should be noted that a neuroagent lacking an Internal Process is not simply a neuron in the ordinary Neural Networks sense. For example, the neuroagent has a Minimal Excitation Zone, and such neuroagents can be built up into a neuroexpression with other neuroagents.
2. The Neuroagent's Basic Behavior.
A neuroagent's basic behavior is determined by the behavior parameters, the neuroagent's state and the inference propagation mechanisms.
a. The Behavior Parameters.
The neuroagent's inferential behavior is conditioned through various behavior parameters, among them: (i) the Excitation Threshold; (ii) the Excitation Level; and (iii) the Modulation Coefficient.
These behavior parameters shall be explained, in part, with reference to FIG. 4, where a neuroagent j 420 is acted upon by neuroexpressions, such as E.sub.i 430 and E.sub.r 440 (consisting of neuroagents k1 432, k2 434 and r 442, respectively) through the Contextual Excitation Zone 424, and by neuroagent (equivalently, atomic neuroexpression) 410 through the Minimal Excitation Zone 422.
(1) Excitation Threshold.
The Excitation Threshold, et.sub.j, sets the validation threshold for the Contextual Excitation Zone 424. The default setting for the Excitation Threshold parameter is 100%. In other words, if the sum of all stimulations exceeds 100%, the Contextual Excitation Zone 424 is considered validated.
(2) Excitation Level.
The Excitation Level allows the determination of the state of the Contextual Excitation Zone 424. In the embodiment discussed here, the Excitation Level el.sub.j of neuroagent j 420 is established as: ##EQU1## where f is a "stimulation" function. This function is linear by default, but clearly it is possible to choose other functions, a double sigmoidal function for example. The expression S.sub.Eij 438 (438') is the stimulation of neuroexpression E.sub.i 430 with respect to neuroagent j 420, and eval_exp(E.sub.i) is the logic evaluation function for the neuroexpression E.sub.i 430. Thus, in this example, eval_exp(E.sub.i) is true if and only if component neuroagents k1 432 and k2 434 are true, i.e., that they are both validated. Similarly, eval_exp(E.sub.r) is true identically with neuroagent r 442 being validated. The default setting for the Excitation Level parameter is 0%.
(3) Modulation Coefficient.
The Modulation Coefficient is used to dynamically modulate the neuroagent's 420 connection weights. Indeed, the connection weight W.sub.Eij 436 is a fixed value determined by a training expert or by learning from examples. But, neuroagent j 420 will receive a stimulation S.sub.Eij 438 (438') from neuroexpression E.sub.i 430, if the latter is validated, rather than the connection weight W.sub.Eij 436. An exemplary formula for calculating the stimulation S.sub.Eij 438 (438') would be:
$$S_{E_{ij}} = W_{E_{ij}} \times mc_{E_i} \qquad \text{(Eq. 2)}$$
with mc.sub.Ei the Modulation Coefficient of the neuroexpression E.sub.i 430. The Modulation Coefficients of individual neuroagents may be established by numerical functions, based on fuzzy predicates or statistics, for example. Furthermore, one can propose various ways of compounding the Modulation Coefficient of a neuroexpression based on the Modulation Coefficients of the component neuroagents (e.g., average). In this embodiment, however, the compounded Modulation Coefficient mc.sub.Ei is taken to be: ##EQU2## where {mc.sub.h } are the Modulation Coefficients of all neuroagents h (say 432 or 434) which belong to the neuroexpression E.sub.i 430. In this context, each of the modulation coefficients mc.sub.h is evaluated with respect to the reference data point. The default setting for the Modulation Coefficient parameter is 100%.
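By way of illustration only, the interplay of connection weight, Modulation Coefficient, stimulation and Excitation Level might be sketched as follows. The function names are hypothetical. Two assumptions are made: percentages are carried as plain numbers (100 meaning 100%), and the Excitation Level is taken as the sum of f applied to each stimulation whose neuroexpression evaluates true. The compounding rule defaults to the average, which the text names only as one possible option; it is not the compounding formula of this embodiment.

```python
from statistics import mean


def compound_mc(coefficients, combine=mean):
    """Compound component Modulation Coefficients (in percent).

    Assumption: `combine` defaults to the average, one option named in the
    text; the embodiment's own compounding formula is not reproduced here.
    """
    return combine(coefficients)


def stimulation(weight, mc):
    """Eq. 2: stimulation = connection weight x Modulation Coefficient.

    Both arguments are percentages, so mc = 100 leaves the weight unchanged.
    """
    return weight * mc / 100.0


def excitation_level(connections, f=lambda x: x):
    """Sum f(stimulation) over connections whose neuroexpression is true.

    `connections` is a list of (weight, mc, is_true) tuples; f is the
    stimulation function, linear (identity) by default.
    """
    return sum(f(stimulation(w, mc)) for (w, mc, is_true) in connections if is_true)
```

For example, a weight of 80 modulated at 50% contributes a stimulation of 40, so together with an unmodulated weight of 60 the Excitation Level reaches 100.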
(a) Fuzzy Predicates.
Owing to the Modulation Coefficient, neuroagents can be used to implement some features of fuzzy associative memories, which can be illustrated with FIGS. 5(a)-(d) in the context of a traffic problem. With reference to FIG. 5(a), the fuzzy membership set 900 for traffic density is shown, consisting of light traffic distribution 902, medium distribution 904 and heavy distribution 906. From the fuzzy membership defined by set 900, one can take the fuzzy primitive "is_from", mc.sub.h 922, shown in FIG. 5(c), which represents the heavy traffic distribution. Similarly, one could have chosen a fuzzy primitive "is_between", mc.sub.p 912, shown in FIG. 5(b), taken from another fuzzy membership set (not shown) related to the peak traffic period. These fuzzy primitives 912, 922 can be applied to the example network 950 shown in FIG. 5(d) with the aim of predicting when to deviate from one's normal path to avoid traffic. The network 950 is composed of neuroexpressions E.sub.i 960 and E.sub.r 970 connected to output neuroagent "Deviations" j 980. Neuroexpression E.sub.i 960 is composed of the neuroagents "Peak Period" 962 and "Heavy Traffic" 964, while atomic neuroexpression E.sub.r 970 is composed simply of neuroagent "Road Works" 972. Thus, applying fuzzy primitives mc.sub.p 912 and mc.sub.h 922, one can arrive at the "compound" modulation coefficient, mc.sub.Ei 990, as, for example, from (Eq. 3). Thus, the closer the hour is to peak period, i.e., between the hours of 6 and 8 o'clock, as well as the higher the traffic density is with respect to 200 cars, the more likely that neuroexpression E.sub.i 960 will validate "Deviations" j 980. Of course, the presence of road works will also validate "Deviations" j 980 in the conventional manner. Thus, fuzzy predicates can be freely mixed with deterministic predicates.
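By way of illustration only, fuzzy primitives of the kind named above might be sketched as piecewise-linear functions returning a Modulation Coefficient in percent. The exact shapes appear only in the figures, so the linear ramps, the parameter names and the breakpoints used in the usage note below are assumptions.

```python
def is_from(x, low, high):
    """Ramp: 0% at or below `low`, rising linearly to 100% at or above `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 100.0
    return 100.0 * (x - low) / (high - low)


def is_between(x, start, end, margin):
    """Trapezoid: 100% inside [start, end], falling linearly to 0% over
    `margin` on either side of the interval."""
    if start <= x <= end:
        return 100.0
    distance = start - x if x < start else x - end
    return max(0.0, 100.0 * (1.0 - distance / margin))
```

Under these assumed shapes, `is_between(hour, 6, 8, 1)` could serve as the coefficient for "Peak Period" and `is_from(cars, 100, 200)` as the coefficient for "Heavy Traffic", the value 100 being a hypothetical lower breakpoint.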
b. The Neuroagent's State Determination.
At any given time, each neuroagent is characterized by its state, which is established as the result of all the stimulation it receives on its Communication and Activation Envelope.
(1) Defining the Neuroagent State.
Three states are possible for a neuroagent: (i) undetermined; (ii) validated; or (iii) inhibited. Table I summarizes all of the neuroagent's states.
TABLE I
Neuroagent States
______________________________________________________________
Minimal              Contextual           Neuroagent
Excitation Zone      Excitation Zone      General State
______________________________________________________________
Validated            No connection        Validated
No connection        Validated            Validated
Validated            Validated            Validated
Inhibited            No connection        Inhibited
No connection        Inhibited            Inhibited
Inhibited            Any                  Inhibited
Any                  Inhibited            Inhibited
Undetermined         No connection        Undetermined
No connection        Undetermined         Undetermined
Undetermined         Undetermined         Undetermined
Undetermined         Validated            Undetermined
Validated            Undetermined         Undetermined
______________________________________________________________
The default state for a neuroagent is undetermined.
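By way of illustration only, the combination rule of Table I might be sketched as follows. The names `State` and `general_state` are hypothetical, and `None` stands for "No connection". The case where neither zone has a connection is not listed in Table I; returning the default state for it is an assumption.

```python
from enum import Enum


class State(Enum):
    VALIDATED = "validated"
    INHIBITED = "inhibited"
    UNDETERMINED = "undetermined"


def general_state(minimal, contextual):
    """Combine the two zone states per Table I; None means 'No connection'."""
    zones = [z for z in (minimal, contextual) if z is not None]
    if not zones:
        # Assumption: this case is not covered by Table I, so fall back
        # on the default neuroagent state.
        return State.UNDETERMINED
    if State.INHIBITED in zones:
        return State.INHIBITED      # inhibition on either zone dominates
    if State.UNDETERMINED in zones:
        return State.UNDETERMINED   # otherwise any doubt leaves it undetermined
    return State.VALIDATED          # every connected zone is validated
```

Read this way, the table reduces to a precedence order: inhibited, then undetermined, then validated.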
(2) Defining the State of the Minimal Excitation Zone.
Again, with reference to FIG. 4, the Minimal Excitation Zone 422 is validated when the neuroexpressions (here, neuroagent 410) connected to this zone, i.e., the logic conditions described with neuroagents, are validated. This zone 422 is inhibited otherwise. By convention, if there are no connections on the Minimal Excitation Zone 422 then it is considered to be validated. The evaluation of the Minimal Excitation Zone 422 of the neuroagent j 420, mez.sub.j, is thus formulated as follows: ##EQU3##
with nb_mez.sub.j the number of connections on the Minimal Excitation Zone 422 of the neuroagent j 420, and eval_mez(j) the logic evaluation function of the neuroexpressions 410 connected on the Minimal Excitation Zone 422 of neuroagent j 420.
(3) Defining the State of the Contextual Excitation Zone.
The Contextual Excitation Zone 424 is validated when the sum of positive and negative stimulations 438 (438') meets or exceeds the neuroagent's Excitation Threshold. Similarly, the Contextual Excitation Zone 424 is inhibited when the sum of the positive and negative stimulations 438 (438') is less than or equal to the negative value of the Excitation Threshold. Between these two limits, the Contextual Excitation Zone 424 is undetermined. As the default value of the Excitation Threshold is equal to 100%, this means that the Contextual Excitation Zone 424 is inhibited when the sum of the positive and negative stimulations 438 (438') is less than or equal to -100%.
By convention, if there are no connections on the Contextual Excitation Zone 424 then it is considered to be validated. The evaluation of the Contextual Excitation Zone 424 of the neuroagent j, cez.sub.j, is thus formulated as ##EQU4## with nb_cez.sub.j the number of connections on the Contextual Excitation Zone 424, el.sub.j the Excitation Level of the neuroagent j 420, and et.sub.j the Excitation Threshold of the neuroagent j 420. The Excitation Level, el.sub.j, would be given by (Eq. 1).
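By way of illustration only, the two zone evaluations described in words above might be sketched as follows. The function names and the string state labels are hypothetical. The minimal zone follows the two-valued wording of the text (validated or inhibited), which is a simplification.

```python
def eval_mez(conditions):
    """Minimal Excitation Zone: validated when there are no connections or
    when every connected neuroexpression is true; inhibited otherwise.

    `conditions` is a list of booleans, one per connection.
    """
    return "validated" if all(conditions) else "inhibited"


def eval_cez(excitation_level, threshold=100.0, n_connections=1):
    """Contextual Excitation Zone: compare the Excitation Level with the
    Excitation Threshold (both in percent)."""
    if n_connections == 0:
        return "validated"          # convention: no connection means validated
    if excitation_level >= threshold:
        return "validated"
    if excitation_level <= -threshold:
        return "inhibited"
    return "undetermined"           # strictly between the two limits
```

With the default threshold of 100, an Excitation Level of 60 leaves the zone undetermined, while -100 inhibits it.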
c. The Inference Propagation Mechanism.
The propagation mechanisms incorporated into various embodiments of neuroagents include: (i) forward propagation; (ii) backward propagation; (iii) spontaneous backward propagation; and (iv) "retropropagation of necessities." All these propagation mechanisms are asynchronous, meaning that the updates of neuroagents are event-driven.
FIGS. 6(a)-(c) show the different inference propagation mechanisms within the same system. In FIG. 6(a), forward propagation, the usual mode of propagation, is shown. Neuroagent 600 is connected to neuroagent 610 via the latter's Minimal Excitation Zone 614 or Contextual Excitation Zone 612 (connections 604 or 602, respectively). Thus, when neuroagent 600 is validated, this state is propagated, as signals 608 or 606, as the case may be, to neuroagent 610. Thus, if the minimal conditions on neuroagent 610 are satisfied and/or the excitation threshold reached, neuroagent 610 itself may be validated and the propagation may continue further.
FIG. 6(b) shows backward propagation. Notice that the connections are analogous to FIG. 6(a), with neuroagent 620 connected to neuroagent 630 via the latter's Minimal Excitation Zone 634 or Contextual Excitation Zone 632 (connections 624 or 622, respectively). Backward propagation may be performed as a result of an explicit selection to backward propagate, or may occur spontaneously, through the mechanism of Hypothesis. The Hypothesis mechanism triggers backward propagation where neuroagent validation is almost present, as for example: (i) the Minimal Excitation Zone 634 is validated and the Contextual Excitation Zone 632 is near but below its excitation threshold, say in the range of 80-100%; or (ii) the Contextual Excitation Zone 632 is validated but the Minimal Excitation Zone 634 is undetermined. Thus, depending on the mechanism, neuroagent 630 will backward propagate, as either signals 636 or 638 (under Hypothesis, owing to the spontaneous generation by the Contextual 632 or Minimal 634 Excitation Zones, respectively), to neuroagent 620. Thus, neuroagent 620, based on this backward propagation, will find itself either validated, inhibited or undetermined. The undetermined state may cause further spontaneous backward propagation, or the process will stop if neuroagent 620 is not configured to go into Hypothesis.
Retropropagation of the necessities involves only the Minimal Excitation Zone, and is a means to verify implicit deductions, as shown in FIG. 6(c). Here, neuroagent 650 is connected to neuroagent 640 through the latter's Minimal Excitation Zone (connection 652). Neuroagent 650 may be connected to other neuroagents (not shown) through connection 654. Thus, if neuroagent 640 is validated (signal 642) retropropagation will occur (signal 644), thereby validating neuroagent 650, which will forward propagate itself (signal 656) as will neuroagent 640 (via connection 646). The implicit deductions are thus verified in the sense that the network connection topology supplies the information. Say that neuroagent 640 represents "CAR" and neuroagent 650 "WHEELS". Thus, this connection of neuroagents 640, 650 would supply the deduction that "CARS" implies "WHEELS".
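By way of illustration only, retropropagation of necessities might be sketched as a traversal of Minimal Excitation Zone connections. The name `retropropagate` and the dictionary layout are hypothetical. Continuing the traversal through several levels of minimal connections is an assumption, since the text describes a single step.

```python
from collections import deque


def retropropagate(validated_agent, minimal_inputs):
    """Return every neuroagent implied by `validated_agent` through
    Minimal Excitation Zone connections.

    `minimal_inputs` maps a neuroagent name to the names connected on its
    Minimal Excitation Zone (its necessary conditions).
    """
    implied = set()
    queue = deque([validated_agent])
    while queue:
        agent = queue.popleft()
        for required in minimal_inputs.get(agent, ()):
            if required not in implied:   # also guards against cycles
                implied.add(required)
                queue.append(required)
    return implied


# "WHEELS" is a necessary condition of "CAR", as in FIG. 6(c).
minimal_inputs = {"CAR": ["WHEELS"]}
```

Validating "CAR" in this network yields the implicit deduction that "WHEELS" is validated as well.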
3. Construction of Neuroagent Networks.
With the neuroagent approach, it is possible to design a knowledge base through either explicit modelling, learning, or both. This versatility enhances the quality of the knowledge bases, since in many cases neither explicit modelling nor learning from examples is sufficient by itself.
The learning process is conducted with two objectives: (i) to automatically establish the connection weights as in usual connectionist models; and (ii) to automatically establish the topology of the network. Due to the neuroagent's connectionist architecture, the system will not be a "black box" at the end of the learning; rather, it will be able to reach semantic conclusions, i.e., make explicit predictions as to: minimal conditions for the validation of outputs, the simultaneous presence of certain inputs, and the specificity of certain inputs, etc.
Say one had a medical database to study in order to design a knowledge base of pathologies diagnosed in various patients. At the end of the learning process, one would obtain the connection weights established by the system, but also: (i) which symptoms are minimal (necessarily present) in order to diagnose a given pathology; (ii) which symptoms are always found together; and (iii) which symptoms are specific to a given pathology.
The neuroagent architecture has the following characteristics: (i) the topology of the neuroagent network is built during the learning period; (ii) the input parameters can be qualitative and/or quantitative; (iii) input parameters can be missing; (iv) the neuroexpressions are built during the learning period; (v) learning can be mixed directly with explicit knowledge; (vi) the order in which the examples are presented has no importance; and (vii) the order of the input data included in the examples has no importance.
The learning algorithm used with neuroagents is similar to that used with Probabilistic Neural Networks (PNN). However, it presents a number of differences, such as the advent of neuroexpressions, excitation zones, and the topology building associated therewith. FIGS. 7(a)-(b) show how the topology of a neuroagent is built up during learning. Consider, in FIG. 7(a), how input neuroagents S1 710, S2 720, S3 730 are interconnected via 712, 722, 732, respectively, to output neuroagent P1 700 in view of training data 701. With only one set of training data 701, consisting of input S1 710, S2 720, S3 730 and output P1 700, the strongest assumption compatible with such data is to form a neuroexpression 705 comprised of neuroagents S1 710, S2 720 and S3 730, and connected via 706 to the Minimal Excitation Zone 702 of output neuroagent 700. In other words, without other information, the inputs all are assumed to occur together and be a minimal condition. There are thus no connections to the Contextual Excitation Zone 704 at this stage.
With a second set of training data, this situation changes, as shown in FIG. 7(b). Training data 701', consisting of inputs S1 710', S2 720', S4 740', S5 750' for output P1 700', causes a revision of the prior assumptions. The neuroexpression 705 must be broken down into the smaller neuroexpression 705', consisting of neuroagents S1 710' and S2 720', and the lone neuroagent 730' because the latter was not present in training data 701'. A new neuroexpression 707' is created and composed of neuroagents S4 740' and S5 750', because there is nothing yet to break the assumption that these neuroagents occur together. As S3 730' can no longer be assumed as "minimal", the connection 703' to output neuroagent P1 700' is made through the Contextual Excitation Zone 704'. Similarly, neuroexpression 707' was not "minimal" in the first instance, so it too, is connected via 705' to Contextual Excitation Zone 704'. The connection 706' of residual neuroexpression 705' to the Minimal Excitation Zone 702' remains. It is clear that this process could be repeated with still more training data.
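By way of illustration only, the grouping shown in FIGS. 7(a)-(b) can be reproduced by a batch procedure that groups inputs by the set of examples in which they occur. This is a sketch under that assumption and not the incremental procedure described above. The name `build_topology` is hypothetical.

```python
def build_topology(examples):
    """Group the inputs of one output class into neuroexpressions.

    `examples` is a list of sets of input names, one set per training
    example. Inputs seen in exactly the same examples stay together. The
    group seen in every example is treated as minimal; all other groups
    are treated as contextual.
    """
    signature = {}
    for index, inputs in enumerate(examples):
        for name in inputs:
            signature.setdefault(name, set()).add(index)

    groups = {}
    for name, seen_in in signature.items():
        groups.setdefault(frozenset(seen_in), set()).add(name)

    everywhere = frozenset(range(len(examples)))
    minimal = groups.pop(everywhere, set())
    contextual = sorted(sorted(group) for group in groups.values())
    return sorted(minimal), contextual
```

Because the grouping depends only on which examples contain each input, the result does not depend on the order in which the examples are presented.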
During the learning period, the parameters which are necessary for the calculation of the connection weights are established. With N(E.sub.i | O.sub.j) being the number of examples where the neuroexpression E.sub.i was present when the output was O.sub.j, N(E.sub.i) the total number of examples where neuroexpression E.sub.i was present, N(n.sub.k) the number of examples where the neuroagent n.sub.k was present, N(O.sub.j) the number of examples where the output was O.sub.j, N the total number of examples which were presented during the learning period, and Nb_Class the number of output classes which were presented during the learning period, one can discuss the construction of the connection weights. The evaluation of the connection weight, or impact, w, of E.sub.i on the various outputs O.sub.j is based on a comparative process. In the embodiment discussed here, the connection weight w(E.sub.i,O.sub.j) is defined as P(O.sub.j | E.sub.i), the probability that output neuroagent O.sub.j is validated (i.e., the corresponding outcome is present) when neuroexpression E.sub.i is true. Notice that P(O.sub.j | E.sub.i) can be expressed in terms of the converse probabilities, P(E.sub.i | O.sub.k), namely, the probabilities that E.sub.i is true when any of the output neuroagents {O.sub.k } are validated, in this manner: ##EQU5## A simplifying approximation, in that one is concerned here only with neuroexpression E.sub.i and output O.sub.j, is that all probabilities other than P(E.sub.i | O.sub.j), representing the other (Nb_Class-1) output classes, are equal and denoted by the expression P(E.sub.i | Ō.sub.j). Under this approximation: ##EQU6## where P(E.sub.i | Ō.sub.j) can be expressed in this manner: ##EQU7## Recalling that N(E.sub.i) is the total number of occurrences of neuroexpression E.sub.i, this number certainly cannot be greater than the occurrence of any of its composite neuroagents, N(n.sub.k), and so offers the familiar approximation: ##EQU8## Finally, P(E.sub.i | O.sub.j) itself can be expressed readily as: ##EQU9## Thus, back substitution of (Eq. 8), (Eq. 9) and (Eq. 10) into (Eq. 7) would yield an approximation to P(O.sub.j | E.sub.i) as: ##EQU10## However, to avoid taking irrelevant information into account, a Significant Threshold, ST, may be introduced as follows: ##EQU11## where SD (for standard deviation) and PT (parasite threshold) are arbitrary constants though preferably small. Thus, the final value for the connection weight is calculated as follows: ##EQU12##
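By way of illustration only, the starting point of the derivation, namely expressing P(O.sub.j | E.sub.i) through the converse probabilities, might be sketched with plain Bayes' theorem on empirical counts. The sketch does not implement the equal-probability approximation, the approximation of N(E.sub.i), or the Significant Threshold. The function name and dictionary layout are hypothetical.

```python
def posterior(counts_e_and_o, counts_o):
    """Estimate P(O_j | E_i) for every output class j by Bayes' theorem.

    counts_e_and_o[o] plays the role of N(E_i | O_j): examples of class o
    in which the neuroexpression was present.
    counts_o[o] plays the role of N(O_j): examples of class o.
    Assumes every class in `counts_o` has at least one example.
    """
    total = sum(counts_o.values())                       # N
    joint = {}
    for o, n_o in counts_o.items():
        likelihood = counts_e_and_o.get(o, 0) / n_o      # P(E_i | O_j)
        prior = n_o / total                              # P(O_j)
        joint[o] = likelihood * prior
    evidence = sum(joint.values())                       # P(E_i)
    return {o: (p / evidence if evidence else 0.0) for o, p in joint.items()}
```

For instance, if a neuroexpression was present in 30 of 40 examples of one class and in 10 of 60 examples of another, the estimate for the first class is 0.75.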
The basic neuroagent technology, and the prior art IntelliSphere.TM. product which embodied it, present a formidable learning curve owing to their novel methodology. What is needed, therefore, is an application which can hide the underlying mechanisms of the neuroagent methodology from the casual user yet, at the same time, unleash the strengths of this technology.