1. Field of the Invention
The present invention relates to a method and system for advising on several alternatives for recovering from single or multiple point failures in computer systems, and specifically a method and system intended to assist field engineers in the task of recovering from single or multiple component failures in a well-site instrumentation logging system for logging wells.
2. Background
Computer systems typically achieve some degree of reliability by using hardware redundancy. In simplest terms, if portions of a hardware system are critical, redundant or mirror-image systems are used so that, if one system or part of a system fails, the backup system may be used.
Although redundant systems are effective at achieving specified levels of reliability, redundant systems are costly. Thus, there have been efforts at designing systems which can achieve a high level of reliability without total hardware redundancy. One such system is the Basic CSUF System.
The Basic CSUF System is a well logging system utilizing computer hardware which achieves a high degree of reliability without complete hardware redundancy, thereby lowering the costs of the system. That is accomplished by designing the system to be reconfigurable. In other words, the system is designed using distributed computing such that the role of one processor can be shifted to another in the event the former processor fails. Each configuration of the hardware and software represents a distinct way of distributing the computational load of the well logging system.
When a single failure occurs in this system, however, it is usually not immediately apparent how one should reconfigure the system in order to recover. Multiple failures in the system make the task even more difficult because the effects of multiple failures may overlap or interact in complicated ways. The reconfiguration advisor of the present invention enables the operator to respond to single or multiple component failures quickly and effectively by advising the operator which configurations are possible and preferable given a failure scenario.
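By way of illustration only, the kind of advice described above (which configurations remain possible, and which are preferable, given a set of failed components) can be sketched as follows. This is a minimal sketch under stated assumptions and not the method of the invention: the `Configuration` record, the `advise` function, the component names, and the numeric preference ranking are all hypothetical and do not appear in the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Configuration:
    """One hypothetical way of distributing the computational load."""
    name: str
    required: frozenset  # components that must be working for this configuration
    preference: int      # assumed ranking: lower value means more preferable


def advise(configurations, failed):
    """Return the configurations still usable given the failed components,
    ordered from most to least preferable."""
    failed = frozenset(failed)
    # A configuration is feasible only if none of its required components failed.
    feasible = [c for c in configurations if not (c.required & failed)]
    return sorted(feasible, key=lambda c: c.preference)


# Hypothetical component and configuration names, for illustration only.
CONFIGS = [
    Configuration("full", frozenset({"cpu_a", "cpu_b", "tape_1"}), 0),
    Configuration("cpu_a_only", frozenset({"cpu_a", "tape_1"}), 1),
    Configuration("cpu_b_only", frozenset({"cpu_b", "tape_1"}), 2),
]

if __name__ == "__main__":
    for config in advise(CONFIGS, {"cpu_a"}):
        print(config.name)
```

In this sketch a multiple-failure scenario is handled by passing every failed component in one set, so overlapping effects are resolved by a single feasibility check per configuration. A real advisor would presumably also need to account for interactions between failures that a simple set intersection cannot express.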
Various proposals have been made for using artificial intelligence techniques in a range of applications. For example, the paper by Marcus, McDermott and Wang, "Knowledge Acquisition for Constructive Systems", IJCAI, Vol. 1, 637-39 (1985) describes SALT, a tool designed to assist with problem-solving strategies for an elevator system configurer. A paper by McDermott, "R1: An Expert in the Computer Systems Domain", Proceedings of the First Annual National Conference on Artificial Intelligence, 269-271 (1980) describes an application of knowledge-based systems to the problem of hardware configuration. The input to the system is the customer's order and its output is a set of diagrams displaying the spatial relationships between the components on the order. Those diagrams are then used by the technician who physically assembles the system. A paper by Stengel, "Artificial Intelligence and Reconfigurable Control Systems," Sigart Newsletter, 51 (1985) describes research on an artificial intelligence program for the analytic and experimental investigation of reconfigurable control systems as a method to increase reliability in flight control systems. A paper by Griesmer, et al., "Yes/MVS: A Continuous Real Time Expert System", Proceedings of the National Conference on Artificial Intelligence, Vol. 1, 130-136 (1984) suggests real-time control of computer operating systems. Here, however, the system actually interacts with the hardware itself, a requirement given the real-time nature of the system. A paper by Nelson, "Reactor: An Expert System for Diagnosis and Treatment of Nuclear Reactor Accidents", Proceedings of the National Conference on Artificial Intelligence, 296-301 (1982) describes a knowledge-based expert system under development called REACTOR, intended to assist operators in the diagnosis and treatment of reactor accidents. The knowledge base is described as containing two types of knowledge: function-oriented knowledge concerning the reactor system and event-oriented knowledge describing the expected behavior of the reactor under accident conditions.
Prior to the present invention, however, artificial intelligence techniques have not been applied to the field of error recovery in well-site instrumentation logging systems.
Well-site instrumentation logging systems typically have much higher component failure rates than other computer systems. Well-site logging systems are generally carried on trucks or other similar transport means and, as a consequence, are subjected to abnormally severe physical abuse. Also, well-site logging systems are subjected to extreme environmental conditions ranging from freezing cold weather to extremely hot weather. As a result, such systems experience more repetitive component failures than other systems.
Yet, well-site logging systems must provide a generally higher degree of reliability than other systems. Well-site logging systems must remain fully operable and running for long periods of time without servicing. Further, because of the nature of logging operations, such systems must be able to handle higher data rates than other systems. Additionally, the high costs associated with drilling dictate that logging systems must be reconfigurable "on the fly", namely that they be capable of continuous operation without even slight interruptions and certainly without any loss of data or data processing capabilities.