Sensor-based monitoring can be used in a variety of industrial settings. Power generating systems, manufacturing processes, and a host of other industrial operations involving the coordinated functioning of large-scale, multi-component systems can all be efficiently controlled through sensor-based monitoring. Indeed, sensor-based monitoring can be advantageously employed in virtually any environment in which various system-specific parameters need to be monitored over time under different conditions.
The control of a system or process typically entails monitoring various physical indicators under different operating conditions, and can be facilitated by sensor-based monitoring. Monitored indicators can include temperature, pressure, flows of both inputs and outputs, and various other operating conditions. The physical indicators are typically monitored using one or more transducers or other types of sensors.
An example of a system with which sensor-based monitoring can be advantageously used is an electrical power generation system. The generation of electrical power typically involves a large-scale power generator such as a gas or steam turbine that converts mechanical energy into electrical energy through the process of electromagnetic induction to thereby provide an output of alternating electrical current. A power generator typically acts as a reversed electric motor, in which a rotor carrying one or more coils is rotated within a magnetic field generated by an electromagnet. Important operating variables that should be closely monitored during the operation of a power generator include pressure and temperature in various regions of the power generator as well as the vibration of critical components. Accordingly, sensor-based monitoring is a particularly advantageous technique for monitoring the operation of a power generator.
Regardless of the setting in which it is used, a key task of sensor-based monitoring can be to evaluate data provided by a multitude of sensors. This can be done so as to detect and localize faults so that the faults can be corrected in a timely manner. With a power generating plant, in particular, the timely detection of faults can prevent equipment damage, reduce maintenance costs, and avoid costly, unplanned plant shut-downs.
Monitoring typically involves receiving sensor-supplied data, which can be mathematically represented in the form of sensor vectors. These sensor vectors provide data input into a model and are compared with estimated output values obtained by applying the model to the data input. Large deviations between the actual values of the sensor vectors and the estimated values generated by the model can indicate that a fault has occurred or is about to occur. Accordingly, accurate monitoring can depend critically on the accuracy of the model employed.
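The comparison described above can be illustrated with a minimal sketch. It assumes that the deviation is measured as the Euclidean distance between the observed sensor vector and the model's estimate, and that a fixed limit separates normal from suspect readings; neither choice is prescribed here. The names `residual_norm`, `is_suspect`, and `toy_model`, as well as the numeric values, are hypothetical.

```python
from math import sqrt
from typing import Callable, Sequence

Vector = Sequence[float]


def residual_norm(observed: Vector, estimated: Vector) -> float:
    """Euclidean distance between an observed sensor vector and its estimate."""
    if len(observed) != len(estimated):
        raise ValueError("vectors must have the same length")
    return sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)))


def is_suspect(observed: Vector,
               model: Callable[[Vector], Vector],
               limit: float) -> bool:
    """Flag a reading whose deviation from the model estimate exceeds the limit."""
    return residual_norm(observed, model(observed)) > limit


def toy_model(_observed: Vector) -> Vector:
    """Hypothetical stand-in model: always estimates one fixed 'normal' vector
    (temperature, pressure, vibration; values are illustrative only)."""
    return [500.0, 30.0, 0.2]


if __name__ == "__main__":
    print(is_suspect([500.0, 30.0, 0.2], toy_model, limit=1.0))
    print(is_suspect([520.0, 30.0, 0.2], toy_model, limit=1.0))
```

In this sketch the quality of the decision rests entirely on the estimate returned by the model, which is the dependence on model accuracy noted above.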
There are principally two approaches to constructing such a model. The first approach is referred to as first-principles or physical modeling, and involves constructing a largely deterministic model representing the physical phenomena that underlie the operation of a particular system or process. It can be the case, however, that the physical dimensions of the system are too numerous or too complex to lend themselves to an accurate representation using a physical model. Accordingly, it is sometimes necessary to resort to the second approach, that of statistical modeling. Sensor-based monitoring of a power generation system, largely because it can require the use of literally hundreds of sensors, can necessitate the construction of such a statistical model. Constructing a statistical model involves “training” a probabilistic model using historical data samples of the system. The purpose of training the model is to glean from the historical data the distribution of the sensor vectors when the system is operating normally.
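By way of illustration only, training a statistical model on historical data might be sketched as follows. The sketch assumes, purely for simplicity, that each sensor is modeled independently by its sample mean and sample standard deviation, and that a new sensor vector is scored by its largest absolute z-score; an actual model of the sensor vector distribution may be considerably richer. The function names and data are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple

SensorModel = List[Tuple[float, float]]  # one (mean, standard deviation) per sensor


def fit_normal_model(history: Sequence[Sequence[float]]) -> SensorModel:
    """Estimate a per-sensor mean and standard deviation from historical vectors.

    Assumes at least two historical vectors and that every sensor actually
    varies in the history (a zero spread would make z-scores undefined).
    """
    model = []
    for column in zip(*history):  # transpose: one column per sensor
        spread = stdev(column)
        if spread == 0:
            raise ValueError("sensor shows no variation in the training data")
        model.append((mean(column), spread))
    return model


def max_z_score(model: SensorModel, vector: Sequence[float]) -> float:
    """Largest absolute deviation of any sensor, in units of its standard deviation."""
    return max(abs(x - m) / s for (m, s), x in zip(model, vector))


if __name__ == "__main__":
    # Illustrative history of two sensors, assumed to come from normal operation.
    history = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]]
    model = fit_normal_model(history)
    print(max_z_score(model, [2.0, 12.0]))  # a vector at the historical mean
    print(max_z_score(model, [5.0, 12.0]))  # first sensor far from its mean
```

Because the model is estimated entirely from the history passed to `fit_normal_model`, any abnormal samples in that history directly alter the learned means and spreads.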
An oft-overlooked fact with respect to conventional statistical modeling is that just as the actual monitoring depends critically on the accuracy of the model employed, so, in turn, the accuracy of the model depends critically on the data set used to train the model. Several drawbacks inherent in statistical modeling flow inevitably from difficulties associated with acquiring good data for training a model, especially in the context of monitoring a power generation system.
Firstly, it is often not known whether a fault occurred during the training period in which the data was collected. If one did, a model trained on that data will treat the faulty behavior as normal, and the inclusion of that data will therefore obscure faults that may occur during actual testing or monitoring of a system or process.
Secondly, even if the training data is fault free, there can yet be large variations within the set of training data. This can occur if the data is collected during different modes of operation of a system. For example, in the context of a power generation system, the power generator can be operated in both a full-load (or base) mode as well as a part-load mode. Because these operating modes are sufficiently different, the resulting training data will likely exhibit significant variability. This makes the difficult task of modeling a complex sensor vector distribution with a single model all the more problematic.
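The effect of mixing operating modes can be seen in a small numerical sketch. The readings below are invented for illustration and do not come from any actual generator; `spread_ratio` is a hypothetical helper that compares the standard deviation of the pooled data with the largest standard deviation found within any single mode.

```python
from statistics import stdev
from typing import Sequence


def spread_ratio(groups: Sequence[Sequence[float]]) -> float:
    """Ratio of the pooled standard deviation to the largest per-group one."""
    pooled = [value for group in groups for value in group]
    return stdev(pooled) / max(stdev(group) for group in groups)


# Invented readings of one sensor in two operating modes (illustrative only).
base_load = [100.2, 99.8, 100.1, 99.9]
part_load = [60.1, 59.9, 60.2, 59.8]

if __name__ == "__main__":
    print(spread_ratio([base_load, part_load]))
```

With these invented values the pooled spread is far larger than the spread within either mode, so a single model fitted to the pooled data would tolerate deviations that are large relative to the normal variation of either mode.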
Thirdly, the training data can include data generated during transition periods as the system transitions from one mode of operation to another. For example, in the context of a power generation system, data collected during the time period in which the power generator is in transition between states will inevitably reflect an other-than-normal physical state of the generator. Inclusion of such data among the set of training data, accordingly, can skew the resulting model.
Conventional models have typically been constructed using simple threshold rules, with different thresholds set for individual sensors. Models so constructed generally tend to neglect the inherent problems already described. They also tend to obscure the fact that constructing models using conventional techniques with data that has wide variability results in a second-best trade-off. This trade-off can necessitate a choice between relying on a limited, threshold-based model or, alternatively, constructing multiple models from a data set that excludes relevant data.
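A per-sensor threshold rule of the kind referred to above might look like the following sketch. The sensor names, limits, and readings are hypothetical, and the rule shown (a fixed lower and upper bound per sensor) is only one simple form that such a model could take.

```python
from typing import Dict, List, Tuple

Limits = Dict[str, Tuple[float, float]]  # sensor name -> (low, high)


def threshold_alarms(reading: Dict[str, float], limits: Limits) -> List[str]:
    """Return the names of sensors whose readings fall outside their limits."""
    alarms = []
    for name, (low, high) in limits.items():
        value = reading.get(name)
        if value is not None and not (low <= value <= high):
            alarms.append(name)
    return sorted(alarms)


# Hypothetical limits, chosen wide enough to cover every operating mode.
LIMITS: Limits = {
    "exhaust_temp_c": (300.0, 600.0),
    "vibration_mm_s": (0.0, 5.0),
}

if __name__ == "__main__":
    print(threshold_alarms({"exhaust_temp_c": 650.0, "vibration_mm_s": 2.0}, LIMITS))
    print(threshold_alarms({"exhaust_temp_c": 580.0, "vibration_mm_s": 2.0}, LIMITS))
```

Each sensor is checked in isolation and against a band that must accommodate all modes of operation, so a reading that is abnormal for the current mode but still inside the band raises no alarm. This is the limitation of threshold-based models noted above.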
Accordingly, there is a need for a system and method directed to the selection of data for training a model, especially one that can be used for sensor-based monitoring of a power generator or similar type of system. Moreover, there is a need for a system and method that addresses the problem of having to either construct a limited threshold-based model or construct multiple models on the basis of a reduced data set.