Knowledge-Based and Model-Based System for Monitoring and Control
Process industries, including the pharmaceutical, biotechnology, chemical, food, and environmental industries, among others, may save millions of dollars by using artificial intelligence for process optimization to control complex production facilities. Large-scale cultivations of microorganisms or mammalian cells are extreme cases in terms of complexity, when the cells are considered as individual manufacturing plants involved in complex chemical synthesis. Current systems monitor very general types of phenomena, such as gas pressure, pH, and, on some occasions, the concentration of some product that correlates with cell growth or production, but those parameters are usually poor indicators of how much of the desired product is produced. Other methods for designing monitoring and control systems for laboratory and industrial applications have been described, such as the one in the patent application published as EP 0 367 544 A2 (Int. Dev. Res. Center) 9 May 1990, which uses a graphical interface to model the set of instruments and controllers of such monitoring and control systems and a natural language to allow the integration of the knowledge of experts into the automated control facilities. Most monitoring systems are concerned with the overall processes that occur within the physical constraints of given reaction tanks, but do not model the many compartmentalized subsystems contained in each of those tanks with biological systems, to more finely tune the productivity of those subsystems. Complex mixtures of chemical reactions can be finely controlled externally by modifying the types and amounts of inputs added, provided one can predict what will happen when those inputs are added, which requires good knowledge and a model of such a system of reactions.
This is particularly the case with biological cellular systems that have very sophisticated methods to transduce the signals provided by ligands in their external environment to the interior of the cell, resulting in the execution of specific functions. Such detailed and accessible mechanistic models of those pathways of reactions are not currently used for monitoring and control systems, but would be highly desirable.
Several knowledge-based systems for monitoring and control functions have been used to include the knowledge of experts in the automated control of production facilities. A knowledge-based system interprets data using diverse forms of knowledge added to the system by a human domain expert, including: a) shallow knowledge or heuristics, such as human experience and interpretations or rules of thumb; and b) deep knowledge about the system behavior and interactions. The systems that are mainly based on the first type of knowledge are in general referred to as knowledge-based expert systems, and their logic is represented in the form of production rules. In the more advanced real-time expert systems, inferencing techniques are usually data-driven, using forward chaining, but can also employ backward chaining for goal-driven tasks and for gathering data. The inference engine searches for and executes relevant rules, which are kept separate from the inference engine, and therefore the representation is intrinsically declarative.
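The distinction between data-driven and goal-driven inference can be illustrated with a minimal sketch. The rule names, the facts, and the functions forward_chain and backward_chain are hypothetical and chosen only for illustration; they do not reproduce the rule language or inference engine of any particular shell. The rules are plain data, separate from the two functions that interpret them, which is what makes the representation declarative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    conditions: frozenset  # facts that must all hold for the rule to fire
    conclusion: str        # fact asserted when the rule fires


def forward_chain(facts, rules):
    """Data-driven inference: fire rules until no new fact is derived."""
    known = set(facts)
    fired = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conclusion not in known and rule.conditions <= known:
                known.add(rule.conclusion)
                fired.append(rule.name)
                changed = True
    return known, fired


def backward_chain(goal, facts, rules, _seen=None):
    """Goal-driven inference: try to prove `goal` from `facts` via `rules`."""
    seen = _seen or set()
    if goal in facts:
        return True
    if goal in seen:  # guard against circular rule sets
        return False
    seen = seen | {goal}
    return any(
        all(backward_chain(c, facts, rules, seen) for c in rule.conditions)
        for rule in rules
        if rule.conclusion == goal
    )


# Hypothetical heuristic rules of the kind an expert might supply.
RULES = [
    Rule("r1", frozenset({"ph-low", "growth-slow"}), "culture-stressed"),
    Rule("r2", frozenset({"culture-stressed"}), "raise-alarm"),
]
```

Calling forward_chain with the observed facts derives every conclusion the data support, whereas backward_chain starts from one goal such as "raise-alarm" and examines only the rules that could establish it.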
Object-oriented expert systems allow a powerful knowledge representation of physical entities and conceptual entities. In those systems, data and behavior may be unified in the class hierarchy. Each class has a template that defines each of the attributes characteristic of that class and distinguishes it from other types of objects. Manipulation and retrieval of the values of the data structures may be performed through methods attached to an object's class. Model-based systems can be derived from empirical models based on regression of data or from first-principle relationships between the variables. When sufficient information to model a process, or part of it, is available, a more precise and compact system can be built.
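A class template and its inheritance through the hierarchy can be sketched as follows. The classes ProcessObject, Vessel, and Bioreactor, their attributes, and the get and set methods are hypothetical names used only to illustrate the idea; they are not taken from any expert-system shell.

```python
class ProcessObject:
    """Root class: the template lists the attributes every instance carries."""
    template = {"name": None}

    def __init__(self, **values):
        # Merge templates from the root class down, so that each subclass
        # extends the attributes defined by its superiors in the hierarchy.
        merged = {}
        for cls in reversed(type(self).__mro__):
            merged.update(getattr(cls, "template", {}))
        unknown = set(values) - set(merged)
        if unknown:
            raise AttributeError(f"not in template: {sorted(unknown)}")
        merged.update(values)
        self._slots = merged

    def get(self, attribute):
        """Method attached to the class for retrieving an attribute value."""
        return self._slots[attribute]

    def set(self, attribute, value):
        """Method attached to the class for changing an attribute value."""
        if attribute not in self._slots:
            raise AttributeError(attribute)
        self._slots[attribute] = value


class Vessel(ProcessObject):
    template = {"volume_l": 0.0}


class Bioreactor(Vessel):
    template = {"ph": 7.0, "cell_line": None}
```

An instance of Bioreactor carries the attributes of all three templates, while an instance of Vessel rejects an attribute such as ph that belongs only to the more specific class.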
There are a number of commercially available shells and toolkits that facilitate the development and deployment of domain-specific knowledge-based applications. Of those, real-time expert-system shells offer capabilities for reasoning on the behavior of data over time. Each of the real-time object-oriented shells from various vendors offers its own set of advantages, and each follows a different approach, such as compiled versus interpreted, and offers a different level of graphic sophistication. The specific shell currently selected for the implementation of this invention is Gensym Corporation's G2 Version 3.0 system, and in part Version 4.0, which is designed for complex and large on-line applications where a large number of variables can be monitored concurrently. It is able to reason about time, to execute both time-triggered and event-triggered actions and invocations, and to combine heuristic and procedural reasoning, and it offers dynamic simulation, user-interface and database-interface capabilities, and other facilities that allow the knowledge engineer to concentrate on the representation and incorporation of domain-specific knowledge to create domain-specific applications. G2 provides a built-in inference engine, a simulator, prebuilt libraries of functions and actions, developer and user interfaces, and the management of their seamless interrelations. A built-in inspect facility permits users to search for, locate, and edit various types of knowledge. Among the capabilities of G2's inference engine are: a) data structures are tagged with time-stamps and validity intervals that are considered in all inferences and calculations, taking care of truth maintenance; and b) tasks intrinsic to G2 are managed by the real-time scheduler. Task prioritization, asynchronous concurrent operations, and real-time task scheduling are therefore automatically provided by this shell.
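The role of time-stamps and validity intervals in calculations can be illustrated with a minimal sketch. This is not G2 code and does not reproduce G2's interfaces; the class TimedValue, its fields, and the function mean_of_valid are hypothetical, and the assumption made here is simply that a value whose validity interval has expired is excluded from any computation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimedValue:
    value: float
    timestamp: float  # time at which the value was collected, in seconds
    validity: float   # number of seconds for which the value remains valid

    def current(self, now: float) -> Optional[float]:
        """Return the value if it is still valid at `now`, otherwise None."""
        if self.timestamp <= now <= self.timestamp + self.validity:
            return self.value
        return None


def mean_of_valid(readings, now):
    """Average only the readings that have not expired at time `now`."""
    values = [r.current(now) for r in readings]
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid) if valid else None
```

With one reading valid for 10 seconds and another valid for 100 seconds, both collected at time 0, the average at time 5 uses both readings, the average at time 50 uses only the second, and at time 200 no value is available.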
G2 also provides a graphic user interface builder, which may be used to create graphic user interfaces that are language independent and allow information to be displayed using colors, pictures and animation. Dynamic meters, graphs, and charts can be defined for interactive follow-up of the simulation. It also has debugger, inspect and describe facilities. The knowledge bases can be saved as separate modules in ASCII files. The graphic views can be shared with networked remote CPUs or terminals equipped with X Windows server software.
Computer-Aided Physiological and Molecular Modeling and Artificial Intelligence in Molecular Biology
Most computer-aided physiological and molecular modeling approaches have resulted in computer models of physiological function that are numerical mathematical models that relate the physiological variables using empirically determined parameters. Those models, which can become quite complex, aim at modeling the overall system.
Both molecular biology and medicine have been fields of previous activity in the application of artificial intelligence (AI). In molecular biology, although there were some early systems such as Molgen and Dendral, the activity has intensified recently as a consequence of the explosion in new technologies and the derived data, mostly related to the Human Genome Project and the handling of the large amounts of sequence data generated, relating to both DNA and proteins. There has also been an increased interest in computer methodologies for 3D structural models of molecular interactions. For the current state of the art, see the topics covered in symposia such as the recent Second International Conference on Intelligent Systems for Molecular Biology, 1994, Stanford University, CA (its Proceedings are included here by reference). Here, only two projects will be mentioned that have some objectives in common with the system that is the object of this invention. Discussions of other previous approaches are also included in those references.
The Molgen group at Stanford University has studied scientific theory formation in the domain of molecular biology, as reported by Karp, P. D. and Friedland, P. (included here by reference). This project relates to the system of this invention in that both “are concerned with biochemical systems containing populations of interacting molecules . . . in which the form of knowledge available . . . varies widely in precision from quantitative to qualitative”, as those authors write. The Ph.D. dissertation of P. D. Karp (included here by reference) “developed a qualitative biochemistry for representing theories of molecular biology”, as he summarizes in an abstract in AI Magazine, Winter 1990, pp. 9-10. He developed three representation models to deal with biochemical pathways, each having different capabilities and using different reasoning approaches. Model 1 uses IntelliCorp's KEE frames to describe biological objects and KEE rules to describe chemical reactions between the objects, an approach that he recognizes to have serious limitations because it is not able to represent much of the knowledge available to biologists. The objective of Model 2 is to predict reaction rates in a given reaction network, incorporating a combination of quantitative and qualitative reasoning about state variables and their interdependencies. The drawbacks are that this model is not able to incorporate a description of the biological objects that participate in the reactions, and that it does not have temporal reasoning capabilities, representing just a static description of the state variables and their relationships.
The third model, called GENSIM and used for both prediction and hypothesis formation, is an extension of Model 1 and is composed of three knowledge bases or taxonomical hierarchies of classes of a) biological objects that participate in a gene-regulation system, b) descriptions of the biological reactions that can occur between those objects, and c) experiments with instances of those classes of objects. The GENSIM program predicts experimental outcomes by determining which reactions occur between the objects in one experiment, which create new objects that in turn cause new reactions. Characteristics of the GENSIM program that may be relevant to, although different from, the system of this invention are: a) chemical objects are homogeneous populations of molecules, objects can be decomposed into their component parts, and identical objects synthesized during a simulation are merged; b) chemical processes represent reactions between those populations as probabilistic events with two subpopulations, one that participates in the reaction and one that does not. Those processes can create objects and manipulate their properties, but cannot reason about quantitative state variables such as quantities. In his words, “processes . . . specify actions that will be taken if certain conditions hold”, and in that sense are like production rules; c) restrictions are specified in the form of preconditions for chemical reactions to happen; and d) temporal reasoning is not available, resulting again in static representations and simulating only behavior in very short time intervals.
The system of this invention integrates a variety of forms of knowledge representation, some of them totally novel, while some of these forms may have been treated by other authors similarly in some aspects. However, upon integration into a totally new approach, that treatment becomes a part of a novel representation and innovative system. Regarding the semi-quantitative simulation component of this invention, L. E. Widman (1991) describes a semi-quantitative simulation of dynamic systems in a different domain, with the assumption that “ . . . questions can be answered in terms of relative quantities rather than absolute quantities . . . model parameters that are not specified explicitly are given the implicit, default values of ‘normal’ (unity) . . . ”. As will become clear from the detailed descriptions in the following sections, the innovative tools and methods used in the present implementation are quite different. For example, while he assumes that “ . . . the default, or implicit, value of ‘normal’ maps onto unity for parameters and onto zero for variables . . . . ”, the assumption in the prebuilt modular components in the current implementation differs in that the default value of ‘normal’ may map onto values other than unity and zero, with those values being defined based on expert knowledge.
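The difference between the two default conventions can be illustrated with a minimal sketch. The function resolve, the dictionary names, and the example quantities are hypothetical and are not taken from Widman's work or from the present implementation; the sketch only assumes that an explicitly specified value takes precedence, that an expert-defined value of ‘normal’ is used next when one exists, and that the fixed defaults of unity for parameters and zero for variables apply otherwise.

```python
# Fixed defaults quoted from Widman: 'normal' maps onto unity for
# parameters and onto zero for variables.
FIXED_DEFAULTS = {"parameter": 1.0, "variable": 0.0}


def resolve(name, kind, explicit, expert_defaults=None):
    """Return the value to use for a model quantity.

    name            -- hypothetical identifier of the quantity
    kind            -- "parameter" or "variable"
    explicit        -- values specified explicitly in the model
    expert_defaults -- optional expert-defined values of 'normal'
    """
    if name in explicit:
        return explicit[name]
    if expert_defaults is not None and name in expert_defaults:
        return expert_defaults[name]
    return FIXED_DEFAULTS[kind]
```

Under the fixed convention, an unspecified parameter resolves to 1.0; when an expert supplies a value of ‘normal’ for that parameter, such as 0.3, the expert-defined value is used instead.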