Multivariable Predictive Control (MPC) is the most widely used advanced process control technology in process industries, with more than 5,000 worldwide applications currently in service. MPC, which is sometimes also referred to as multivariate control (MVC), employs a model predictive controller that relies on dynamic models of an underlying process, e.g., linear models obtained by system identification.
A common and challenging problem is that MPC control performance degrades with time due to inevitable changes in the underlying subject process, such as equipment modifications, changes in operating strategy, feed rate and quality changes, de-bottlenecking, instrumentation degradation, etc. Such degradation of control performance results in loss of benefits. Among all possible causes of control performance degradation, the process model's predictive quality is the primary factor in most cases. To sustain good control performance, the model's predictive quality needs to be monitored, and the model needs to be periodically audited and updated.
To address the technically challenging problem of model quality auditing and online model identification and adaptation, the Assignee developed an innovative approach for model quality estimation and model adaptation (see U.S. Patent Application Publication No. US 2011/0130850 A1, the parent related application), and a new method for non-invasive closed loop step testing (see U.S. Provisional Application No. 61/596,459, filed on Feb. 8, 2012) that expanded the automated closed loop step testing techniques in Assignee's U.S. Pat. Nos. 7,209,793 and 6,819,964. Each of the above techniques helps in monitoring model quality and generating informative plant test data in a more efficient way. Once the process data becomes available through either open/closed-loop plant tests or historical plant operation records, a necessary and important step is data screening and selection for model quality estimation and model identification. There are two important reasons for performing data screening and selection. First, the process data received from open/closed-loop plant tests are costly, owing not only to the engineer's work during the designed plant testing but also to the intervention in (interruption of) plant production. Therefore the usage of those plant test data should be maximized. Second, collected process time series data may contain segments of samples recorded over periods of unit/equipment shut-downs, measurement equipment errors, variable values at High/Low limits, control outputs that are saturated or frozen, etc. If these data are included in the calculation of model quality estimation or model identification, the results can be contaminated and become unreliable.
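By way of illustration only, two of the conditions named above (a variable value at its High/Low limit, and a frozen signal) can be detected per sample with simple rules. The following Python sketch is not the method of any cited application; the function names `flag_bad_samples` and `bad_slices`, the `frozen_run` threshold and the tolerance `tol` are all hypothetical choices made for this example.

```python
from typing import List, Tuple


def flag_bad_samples(values: List[float], low: float, high: float,
                     frozen_run: int = 5, tol: float = 1e-9) -> List[bool]:
    """Return one flag per sample; True marks a sample as unsuitable.

    Assumed rules (illustrative only): a sample is bad if it sits at or
    beyond a limit, or if it belongs to a run of at least frozen_run
    consecutive samples that do not change.
    """
    n = len(values)
    # Rule 1: sample sits at or beyond its High/Low limit.
    bad = [v <= low + tol or v >= high - tol for v in values]

    # Rule 2: signal is frozen, i.e. unchanged for at least frozen_run samples.
    start = 0
    for i in range(1, n + 1):
        run_ended = i == n or abs(values[i] - values[start]) > tol
        if run_ended:
            if i - start >= frozen_run:
                for k in range(start, i):
                    bad[k] = True
            start = i
    return bad


def bad_slices(flags: List[bool]) -> List[Tuple[int, int]]:
    """Collapse per-sample flags into inclusive (first, last) index pairs."""
    slices, start = [], None
    for i, flagged in enumerate(flags):
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            slices.append((start, i - 1))
            start = None
    if start is not None:
        slices.append((start, len(flags) - 1))
    return slices
```

For example, with limits of 0 and 10 and `frozen_run=4`, the series `[1, 2, 10, 10, 3, 4, 4, 4, 4, 5]` yields the slices `(2, 3)` (at the High limit) and `(5, 8)` (frozen). Real plant data would also call for rules covering shut-downs and measurement errors, which this sketch does not attempt.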
In industrial practice of the prior art, a control engineer spends hours to days viewing all the process variables in time series plots and visually finding those data samples unsuitable for model identification purposes. To exclude the “bad” data samples found, the control engineer manually marks such data segments in software as “bad data slices” through a pertinent user interface. To confirm and double-check whether any “bad data” were missed in manual slicing, the engineer tries a number of model identification runs until the identification algorithm goes through all the data without failures and the resulting models look normal and make sense from the engineer's view. A typical APC project may contain 50-200 process time series variables and the data collection window may cover a period of weeks to months. Conducting the data screening and selection task by hand may take an experienced engineer days to weeks of intensive work.
In addition, there are two other shortcomings of using the conventional approach to data screening and selection for model quality estimation and identification. One is that any of the marked “bad data slices” in a time series will cause the loss of a large piece of good data (e.g., one time to steady state (TTSS) for a FIR model and 40-60 samples for a subspace model) following the “bad data slice”, as a side effect of the required re-initialization. The other drawback is that a conventional approach is not suitable for frequent runs in an online application (such as that described in U.S. Patent Application Publication No. US 2011/0130850 A1) where a pre-scheduled automated data screening and selection operation is needed to serve plant testing monitoring. Once a process variable hits its High/Low limit, becomes saturated, or loses measurements, the automated data screening and selection module should alert the operator and report the situation promptly, so that the engineer may take action to make corrections/adjustments and avoid a loss of time and data in the plant testing.
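The side effect of re-initialization described above can be made concrete with a small counting exercise. The Python sketch below is an illustrative assumption, not a procedure taken from the cited applications: the function name `usable_samples` and the single `reinit` window length (standing in for one TTSS or for the 40-60 samples of a subspace model) are hypothetical.

```python
from typing import List, Tuple


def usable_samples(n: int, slices: List[Tuple[int, int]], reinit: int) -> int:
    """Count samples left for identification out of n.

    Each inclusive (first, last) bad slice is excluded together with the
    reinit samples that follow it (assumed re-initialization window).
    Overlapping exclusions are counted once.
    """
    excluded = [False] * n
    for first, last in slices:
        # Exclude the bad slice itself plus the re-initialization window.
        for k in range(first, min(last + 1 + reinit, n)):
            excluded[k] = True
    return n - sum(excluded)
```

Under these assumptions, a 1000-sample series with two bad slices of 10 and 5 samples and a 60-sample window, `usable_samples(1000, [(100, 109), (500, 504)], 60)`, leaves 865 usable samples: 135 are excluded although only 15 are actually bad.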
There have been some general data preprocessing methods reported in process model identification textbooks (e.g., Lennart Ljung, “System Identification: Theory for the User”, Second Edition, Prentice Hall, 1999), but there is no systematic method for automated data screening and selection. A recently reported method of data selection from historic data for model identification (Daniel Peretzki, Alf J. Isaksson, Andre Carvalho Bittencourt, Krister Forsman, “Data Mining of Historic Data for Process Identification”, AIChE Annual Meeting, 2011) is focused on finding useful intervals by use of a Laguerre approximation model and is limited to single-input and single-output (SISO) PID (Proportional-Integral-Derivative) controller loops. In industrial APC practice, based on Applicants' knowledge, automated data screening and selection has long been a “dream” of APC (Advanced Process Control) engineers, for which neither a systematic solution nor commercial tools are yet available.