The present invention relates generally to machine diagnostics, and more specifically to a system and method that improves diagnostic accuracy by presenting fault and operational data in a chronologically ordered file.
A machine, such as a locomotive or other complex system used in industrial processes, medical imaging, telecommunications, aerospace applications, and power generation, may include controls and sensors for monitoring the various systems and subsystems of the machine, recording certain operational parameters, and generating a fault indication when an anomalous operating condition occurs. Certain of these anomalous conditions may require the imposition of operational restrictions on the machine, without requiring a complete shutdown. In any case, because restricted operation can be costly, it is essential to accurately diagnose and quickly repair the machine.
Such complex machines may generate an error log (or fault log), containing information related to a malfunction. The field engineer called to diagnose and repair the machine will first consult the error log to assist with the diagnosis process. The error log presents a "signature" of the machine's operation and can be used by the repair technician to identify specific malfunctions and the operational parameters of the machine before, during and after the fault occurred. Using her accumulated experiences at solving machine malfunctions, the field engineer reviews the error log, identifies the root cause of the fault and then repairs the machine to correct the problem. If the diagnosis was accurate, the repair will correct the machine malfunction. When the error log contains only a small amount of information and the machine is relatively simple, this manual process will work fairly well. However, if the error log is voluminous and the machine is complex, some entries may have an uncertain relationship or perhaps no relationship to the malfunction. It will therefore be difficult for the field engineer to properly review and analyze all the information and successfully diagnose the fault.
To overcome the problems associated with evaluating large amounts of data in error logs, computer-based diagnostic expert systems have been developed and put to use. These diagnostic expert systems are developed by interviewing field engineers to determine how they proceed to diagnose and fix a machine malfunction. The interview results are then translated into rules and procedures that are stored in a repository, which forms either a rule base or a knowledge base for machine repairs. The rule or knowledge base operates in conjunction with a rule interpreter or a knowledge processor to form the diagnostic expert system. Based on information input by the technician, the rule interpreter or knowledge processor can quickly parse information in the rule or knowledge base to evaluate the operation of the malfunctioning machine and provide guidance to the field engineer. One disadvantage associated with such conventional diagnostic expert systems is the limited scope of the rules or knowledge stored in the repository. The process of knowledge extraction from experts is time-consuming, error-prone and expensive. Finally, the rules are brittle and cannot be updated easily. To update the diagnostic expert system, the field engineers have to be frequently interviewed so that the rules and knowledge base can be reformulated.
Another class of diagnostic systems uses artificial neural networks to correlate operational and fault data with potential root causes. An artificial neural network typically includes a layer of input nodes, a layer of output nodes, and one or more "hidden" layers of nodes between the input and output nodes. Each node in each layer is connected to one or more nodes in the preceding and the following layer. The connections are via adjustable-weight links analogous to variable coupling-strength neurons. Before being placed in operation, the artificial neural network must be trained by iteratively adjusting the connection weights, using pairs of known input and output data, until the errors between the actual and known outputs, based on a consistent set of inputs, are acceptably small. A problem with using an artificial neural network for diagnosing machine malfunctions is that the neural network does not produce explicit fault correlations that can be verified by experts and adjusted if desired. In addition, the conventional steps of training an artificial neural network do not provide a measure of its effectiveness so that more data can be added if necessary. Also, the effectiveness of the neural network is limited and does not work well for a large number of variables.
Case-based reasoning diagnostic expert systems can also be used to diagnose faults associated with malfunctioning machines. Case-based diagnostic systems use a collection of data, known as historical cases, and compare it to a new set of data, a new case, to diagnose faults. In this context, a case refers to a problem/solution pair that represents the diagnosis of a problem and the identification of an appropriate repair (i.e., solution). Case-based reasoning (CBR) is based on the observation that experiential knowledge (i.e., memory of past experiences) can be applied to solving current problems or determining the cause of current faults. The case-based reasoning process relies relatively little on pre-processing of raw input information or knowledge, but focuses instead on indexing, retrieving, reusing, comparing and archiving cases. Case-based reasoning approaches assume that each case is described by a fixed, known number of descriptive attributes and use a corpus of valid historical cases against which new incoming cases can be matched for the determination of the root cause of the fault and the generation of a repair recommendation.
Commonly assigned U.S. Pat. No. 5,463,768 discloses a CBR approach to fault identification that uses fault or error log data from one or more malfunctioning machines. Each of the historical error logs contains data representative of fault events occurring within the malfunctioning machines. In particular, a plurality of historical error logs are grouped into case sets of common malfunctions. Common patterns, i.e., identical consecutive rows or strings of error data in the case sets, are used for comparison with new error log data. In this comparison process, sections of data in the new error log that are common to sections of data in each of the historical case sets (the historical error logs) are identified. Since the historical error logs have been correlated with a specific repair having a high probability of resolving the fault, the common sections of data in the historical error logs and the new error log can lead to a recommended repair with a high probability of resolving the fault.
U.S. Pat. No. 6,415,395, entitled "Method and System for Processing Repair Data and Fault Log Data to Facilitate Diagnostics", assigned to the assignee of the present invention and herein incorporated by reference, discloses a system and method for processing historical repair data and historical fault log data, where this data is not restricted to sequential occurrences of fault log entries, as in the commonly owned patent described above. This system includes means for generating a plurality of cases from the repair data and the fault log data. Each case comprises a repair and a plurality of related, but distinct faults. For each case, at least one repair and distinct fault cluster combination is generated and then a weight is assigned thereto. This weight value indicates the likelihood that the repair will resolve any of the faults included within that fault cluster. The weight is calculated by dividing the number of times the fault cluster combination occurs in cases comprising related repairs by the number of times the combination occurs in all cases. New fault log data is entered into the system and compared with the plurality of fault log clusters. The repair associated with the matching fault log cluster represents a candidate repair to resolve the new fault. The report output from this system lists the candidate repairs in sequential order according to the calculated weights.
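The weight calculation described above can be illustrated with a minimal Python sketch. It is not the implementation of the referenced patent. It assumes that a case is a (repair, fault codes) pair, that a fault cluster is any combination of up to `max_cluster_size` distinct fault codes from one case, and that "cases comprising related repairs" means cases recording the same repair. The names `cluster_weights` and `max_cluster_size` and the sample repair and fault codes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cluster_weights(cases, max_cluster_size=2):
    """Return {(repair, cluster): weight} for a list of (repair, fault_codes) cases.

    weight = (cases with this repair that contain the cluster)
             / (all cases that contain the cluster)
    """
    with_repair = Counter()  # occurrences of a cluster in cases with a given repair
    overall = Counter()      # occurrences of a cluster in all cases
    for repair, faults in cases:
        distinct = sorted(set(faults))  # count each distinct fault once per case
        for size in range(1, max_cluster_size + 1):
            for cluster in combinations(distinct, size):
                overall[cluster] += 1
                with_repair[(repair, cluster)] += 1
    return {
        (repair, cluster): count / overall[cluster]
        for (repair, cluster), count in with_repair.items()
    }


# Hypothetical historical cases: (repair performed, fault codes logged).
cases = [
    ("replace_fuel_pump", ["F1", "F2"]),
    ("replace_fuel_pump", ["F1"]),
    ("reset_controller", ["F1", "F3"]),
]
weights = cluster_weights(cases)
```

In this sample, fault `F1` appears in three cases, two of which record the fuel pump repair, so that combination receives a weight of 2/3. Candidate repairs for a new fault log would then be listed in descending order of the weights of the matching clusters.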
Further, U.S. Pat. No. 6,343,236, entitled "Method and System for Analyzing Fault Log Data for Diagnostics", assigned to the assignee of the present invention and herein incorporated by reference, discloses a system and method for analyzing new fault log data from a malfunctioning machine, again where the system and method are not restricted to sequential occurrences of fault log entries. The fault log data is clustered based on related faults and then compared with historical fault clusters. Each historical fault cluster has associated with it a repair, wherein the correlation between the fault cluster and the repair is indicated by a repair weight. Upon locating a match between the current fault cluster and one or more of the historical fault clusters, a repair action is identified for the current fault cluster based on the repair associated with the matching historical fault cluster.
One particular type of fault that can be advantageously analyzed by certain fault analysis and diagnostic tools is the so-called "no trouble found" fault. Failure conditions that are difficult to diagnose within a complex system may result in such a declaration of no trouble found. Typically, the system experiences intermittent failures and when the system is taken out of service for diagnosis, there is no evidence of a fault or failure, i.e., the fault is intermittent. In this situation, the repair technician declares that the system is failure free and ready for return to service, i.e., no trouble found. Later, the system may experience a repeat failure due to the same problem.
It is believed that the fault and repair analysis tools disclosed in the patent applications described above provide certain advantages and advancements in the art of the diagnostics of complex machines. It would be desirable, however, to provide a system and method to improve the evaluation and identification of faults by undertaking the analysis on a chronologically ordered set of fault data and machine operational information.
The search for an effective tool to diagnose failures occurring in a complex system, such as a railroad locomotive, has been an elusive one. Several such tools are discussed above. The present invention describes a process that systematically combines fault log and machine operational information (sensor readings of various operational parameters commonly referred to as data packs) with maintenance and repair data and fault declaration information into a single chronologically ordered file. With the data ordered chronologically, the present invention offers a more effective tool to diagnose the exact nature and cause of a fault. Combining all of this information, as taught by the present invention, presents the repair analyst with a single file from which the state of the machine can be determined at any time prior to and after the occurrence or recording of a specific fault. By identifying anomalies and analyzing operational parameters occurring immediately prior to the fault (and in some cases, after the fault), the analyst can identify the most probable cause. Additionally, by reviewing the data pack information recorded immediately after a repair or other maintenance action, the analyst can determine the effectiveness of that repair or maintenance activity. The chronologically ordered data presentation also allows the detection of incipient faults based on anomalous conditions and operational parameters. Analysis rules can then be established for later use in recognizing potential fault conditions in the operational data. The present invention can also help to create data filters so that only the information identified as pertinent to a particular fault is analyzed. The filtered out data can be ignored due to its apparent lack of correlation with a specific fault. Using the teachings of the present invention increases the accuracy of failure diagnosis, resulting in more effective (e.g., more timely and less costly) troubleshooting and repair actions. 
Fewer repeat failures will occur because there will be fewer misdiagnosed cases and fewer no trouble found faults.
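As a rough illustration of the chronological combination described above, the following Python sketch merges fault log entries, data packs, and repair records into a single time-ordered list and then extracts the records surrounding a fault. It is a sketch under stated assumptions, not the claimed implementation: the `Record` type, its fields, the function names, and the sample values are all hypothetical, and each source is assumed to carry a comparable timestamp.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import chain


@dataclass(frozen=True)
class Record:
    timestamp: datetime
    source: str  # assumed labels: "fault", "data_pack", "repair"
    detail: str


def merge_chronologically(*streams):
    """Combine any number of record streams into one list ordered by time."""
    # sorted() is stable, so records with equal timestamps keep stream order.
    return sorted(chain.from_iterable(streams), key=lambda r: r.timestamp)


def records_around(merged, event_time, before, after):
    """Return records within [event_time - before, event_time + after]."""
    start, end = event_time - before, event_time + after
    return [r for r in merged if start <= r.timestamp <= end]


# Hypothetical sample data.
faults = [Record(datetime(2001, 1, 1, 10, 5), "fault", "F1")]
data_packs = [
    Record(datetime(2001, 1, 1, 10, 4), "data_pack", "oil_temp=110"),
    Record(datetime(2001, 1, 1, 11, 0), "data_pack", "oil_temp=90"),
]
repairs = [Record(datetime(2001, 1, 1, 10, 30), "repair", "replaced sensor")]

merged = merge_chronologically(faults, data_packs, repairs)
context = records_around(
    merged,
    faults[0].timestamp,
    before=timedelta(minutes=5),
    after=timedelta(minutes=10),
)
```

With the merged list in hand, an analyst could inspect the data pack recorded just before a fault, or compare data packs recorded before and after a repair to judge whether the repair was effective.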