Data sequences often contain redundancy, context dependency and state dependency. The relationships within the data are frequently complex, non-linear and unknown, and the application of existing control and processing algorithms to such data sequences generally does not lead to useful results.
Statistical Process Control (SPC) essentially began with the Shewhart chart and since then extensive research has been performed to adapt the chart to various industrial settings. Early SPC methods were based on two critical assumptions:
i) there exists a priori knowledge of the underlying data distribution (often, observations are assumed to be normally distributed); and
ii) the observations are independent and identically distributed (i.i.d.).
In practice, these assumptions are frequently violated in industrial processes.
Current SPC methods can be categorized into groups using two different criteria as follows:
1) methods for independent data where observations are not interrelated versus methods for dependent data;
2) methods that are model-specific, requiring a priori assumptions on the process characteristics and its underlying distribution, versus methods that are model-generic. The latter methods try to estimate the underlying model with minimum a priori assumptions.
FIG. 1 is a chart of relationships between different SPC methods and includes the following:
Information Theoretic Process Control (ITPC) is an independent-data based and model-generic SPC method proposed by Alwan, Ebrahimi and Soofi (1998). It utilizes information theory principles, such as maximum entropy, subject to constraints derived from the dynamics of the process. It provides a theoretical justification for the traditional Gaussian assumption and suggests a unified control chart, as opposed to traditional SPC methods, which require a separate chart for each moment.
Traditional SPC methods, such as Shewhart, Cumulative Sum (CUSUM) and Exponentially Weighted Moving Average (EWMA), are for independent data and are model-specific. It is important to note that these traditional SPC methods are extensively implemented in industry. The independence assumptions on which they rely are frequently violated in practice, especially since automated testing devices increase the sampling frequency and introduce autocorrelation into the data. Moreover, implementation of feedback control devices at the shop floor level tends to create structured dynamics in certain system variables. Applying traditional SPC to such interrelated processes increases the frequency of false alarms and shortens the ‘in-control’ average run length (ARL) in comparison to uncorrelated observations. As shown later in this section, these methods can be modified to control autocorrelated data.
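By way of illustration only, the three traditional chart statistics named above can be sketched as follows. The function names, the 3-sigma Shewhart limits, the CUSUM slack of 0.5 and decision interval of 5 (both in units of sigma), and the EWMA weight of 0.2 are assumptions chosen for the example and are not taken from the cited references. The in-control mean and standard deviation are assumed known, and the observations are assumed independent.

```python
import math


def shewhart_alarms(xs, mu, sigma, k=3.0):
    """Indices of observations falling outside mu +/- k*sigma."""
    return [i for i, x in enumerate(xs) if abs(x - mu) > k * sigma]


def cusum_alarms(xs, mu, sigma, slack=0.5, h=5.0):
    """Two-sided tabular CUSUM; slack and h are expressed in units of sigma."""
    hi = lo = 0.0
    alarms = []
    for i, x in enumerate(xs):
        z = (x - mu) / sigma
        hi = max(0.0, hi + z - slack)   # accumulates upward deviations
        lo = max(0.0, lo - z - slack)   # accumulates downward deviations
        if hi > h or lo > h:
            alarms.append(i)
            hi = lo = 0.0               # restart both sums after a signal
    return alarms


def ewma_alarms(xs, mu, sigma, lam=0.2, k=3.0):
    """EWMA chart with time-varying limits that widen toward their asymptote."""
    z = mu
    alarms = []
    for i, x in enumerate(xs, start=1):
        z = lam * x + (1.0 - lam) * z
        var_factor = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * i))
        if abs(z - mu) > k * sigma * math.sqrt(var_factor):
            alarms.append(i - 1)
    return alarms


# Ten in-control observations followed by a sustained shift of 1.5 sigma.
data = [0.0] * 10 + [1.5] * 10
print(shewhart_alarms(data, 0.0, 1.0))
print(cusum_alarms(data, 0.0, 1.0))
print(ewma_alarms(data, 0.0, 1.0))
```

In this small example the Shewhart rule never signals, because no single observation exceeds three sigma, whereas the CUSUM and EWMA rules accumulate the sustained shift and do signal. All three rules take the independence of the observations for granted, which is the assumption discussed above.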
The majority of model-specific methods for dependent data are time-series based. The underlying principle of such model-dependent methods is as follows: assuming a time series model family can best capture the autocorrelation process, it is possible to use that model to filter the data, and then apply traditional SPC schemes to the stream of residuals. In particular, the ARIMA (Auto Regressive Integrated Moving Average) family of models is widely applied for the estimation and filtering of process autocorrelation. Under certain assumptions, the residuals of the ARIMA model are independent and approximately normally distributed, so that traditional SPC can be applied to them. Furthermore, it is commonly believed that ARIMA models, mostly the simple ones such as AR(1), can effectively describe a wide variety of industrial processes.
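The filter-then-chart principle can be illustrated with a minimal sketch for the simplest case, an AR(1) model. The helper names, the least-squares estimator, the simulated reference data and the 3-sigma limit are all assumptions made for the example; a practical implementation would use a full ARIMA identification and estimation procedure.

```python
import random
import statistics


def fit_ar1(xs):
    """Least-squares estimate of (mean, phi) in x_t - mean = phi*(x_{t-1} - mean) + e_t."""
    mean = statistics.fmean(xs)
    d = [x - mean for x in xs]
    num = sum(a * b for a, b in zip(d[1:], d[:-1]))
    den = sum(a * a for a in d[:-1])
    return mean, num / den


def ar1_residuals(xs, mean, phi):
    """One-step-ahead prediction errors of the AR(1) model, for t = 1 .. n-1."""
    return [(xs[t] - mean) - phi * (xs[t - 1] - mean) for t in range(1, len(xs))]


def residual_alarms(xs, mean, phi, sigma, k=3.0):
    """Shewhart rule applied to the residual stream; returns indices into xs."""
    res = ar1_residuals(xs, mean, phi)
    return [t + 1 for t, e in enumerate(res) if abs(e) > k * sigma]


def simulate_ar1(n, phi, sigma=1.0, seed=0):
    """Synthetic AR(1) data, used here only to have something to filter."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, sigma)
        out.append(x)
    return out


# Reference (in-control) data: estimate the model and the residual spread.
reference = simulate_ar1(2000, phi=0.8, seed=1)
mean, phi_hat = fit_ar1(reference)
res_sigma = statistics.pstdev(ar1_residuals(reference, mean, phi_hat))

# Monitoring: chart the residuals of new data against +/- 3 residual sigmas.
alarms = residual_alarms(reference, mean, phi_hat, res_sigma)
print(round(phi_hat, 2), len(alarms))
```

The sketch depends on explicit estimates of the model parameters (here `mean` and `phi_hat`), so any error in those estimates carries over directly into the residuals being charted.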
Model-specific methods for autocorrelated data can be further partitioned into parameter-dependent methods, which require explicit estimation of the model parameters, and parameter-free methods, where the model parameters are only implicitly derived, if at all.
Several parameter-dependent methods have been proposed over the years for autocorrelated data. Alwan and Roberts (1988) proposed the Special Cause Chart (SCC), in which the Shewhart method is applied to the stream of residuals. They showed that the SCC has major advantages over Shewhart with respect to mean shifts. The deficiency of the SCC lies in the need to explicitly estimate all the ARIMA parameters. Moreover, the method performs poorly for a large positive autocorrelation, since the mean shift tends to stabilize rather quickly to a steady state value, and the shift is poorly manifested in the residuals (see Wardell, Moskowitz and Plante (1994) and Harris and Ross (1991)).
Runger, Willemain and Prabhu (1995) implemented traditional SPC for autocorrelated data using CUSUM methods. Lu and Reynolds (1997, 1999) extended the approach by using the EWMA method, with one small difference: their model adds a random error term to the ARIMA model. The drawback of these models is the need for explicit parameter estimation and for estimation of their process-dependence features. It was demonstrated in Runger and Willemain (1995) that for certain autocorrelated processes, the use of traditional SPC yields an improved performance in comparison to ARMA-based methods.
The Generalized Likelihood Ratio Test (GLRT) method proposed by Apley and Shi (1999) takes advantage of the transient dynamics of the residuals in the ARIMA model when a mean shift is introduced. The generalized likelihood ratio may be applied to the filtered residuals. The method may be compared to the Shewhart, CUSUM and EWMA methods for autocorrelated data, from which it is inferred that the choice of an adequate time-series based SPC method depends strongly on the characteristics of the specific process being controlled. Moreover, both Apley and Shi (1999) and Runger and Willemain (1995) emphasize in their conclusions that modeling errors in the ARIMA parameters have a strong impact on the performance (e.g., the ARL) of parameter-dependent SPC methods for autocorrelated data. If the process can be accurately defined by an ARIMA time series, the parameter-dependent SPC methods are superior to non-parametric methods, since they allow efficient statistical analysis. If such a definition is not possible, then the effort of estimating the time series parameters becomes impractical. This conclusion, amongst other reasons, triggered the development of parameter-free methods, which avoid the impractical estimation of time-series parameters.
A parameter-free model was proposed by Montgomery and Mastrangelo (1991) as an approximation procedure based on EWMA. They suggested using the EWMA statistic as a one-step-ahead prediction value for the IMA(1,1) model. Their underlying assumption was that even if the process is better described by another member of the ARIMA family, the IMA(1,1) model is a good enough approximation. Zhang (1998), however, compared several SPC methods and showed that the Montgomery and Mastrangelo approximation performed poorly. He proposed employing the EWMA statistic for stationary processes, but adjusted the process variance according to the autocorrelation effects.
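The use of the EWMA statistic as a one-step-ahead predictor can be illustrated with the following minimal sketch. The function names, the grid of candidate weights, and the choice of the weight by minimizing the sum of squared prediction errors are assumptions made for the example; the resulting prediction errors would then be charted in place of the raw observations.

```python
def ewma_prediction_errors(xs, lam, z0=None):
    """Errors e_t = x_t - z_{t-1}, where z is the EWMA used as the forecast."""
    z = xs[0] if z0 is None else z0
    errors = []
    for x in xs:
        e = x - z            # error of the forecast made before seeing x
        errors.append(e)
        z = z + lam * e      # equivalent to lam*x + (1 - lam)*z
    return errors


def choose_lambda(xs, grid=None):
    """Pick the weight on a grid that minimizes the sum of squared errors."""
    if grid is None:
        grid = [i / 10 for i in range(1, 11)]   # 0.1, 0.2, ..., 1.0

    def sse(lam):
        return sum(e * e for e in ewma_prediction_errors(xs, lam))

    return min(grid, key=sse)


print(ewma_prediction_errors([1.0, 2.0, 3.0], lam=0.5, z0=0.0))
# [1.0, 1.5, 1.75]
```

No ARIMA parameters are estimated explicitly in this sketch; the single weight `lam` stands in for the model, which is what makes the procedure parameter-free in the sense used above.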
Runger and Willemain (1995, 1996) discussed the weighted batch mean (WBM) and the unified batch mean (UBM) methods. The WBM method assigns weights to the observations in each batch mean and defines the batch size so that the autocorrelation among batches reduces to zero. In the UBM method the batch size is defined (with unified weights) so that the autocorrelation remains under a certain level.
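A minimal sketch of the UBM idea follows, under stated assumptions: batches are non-overlapping with equal weights, the batch size is doubled until the lag-1 autocorrelation of the batch means falls to or below a threshold, and the threshold of 0.1 and the minimum of 20 batches are illustrative values only. The function names are hypothetical.

```python
import statistics


def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation; returns 0.0 for a constant series."""
    m = statistics.fmean(xs)
    d = [x - m for x in xs]
    den = sum(a * a for a in d)
    if den == 0:
        return 0.0
    return sum(a * b for a, b in zip(d[1:], d[:-1])) / den


def batch_means(xs, b):
    """Means of consecutive non-overlapping batches of size b (remainder dropped)."""
    n = len(xs) // b
    return [statistics.fmean(xs[i * b:(i + 1) * b]) for i in range(n)]


def pick_batch_size(xs, max_autocorr=0.1, min_batches=20):
    """Double b until the batch means are nearly uncorrelated, or give up."""
    b = 1
    while len(xs) // b >= min_batches:
        if abs(lag1_autocorr(batch_means(xs, b))) <= max_autocorr:
            return b
        b *= 2
    return None   # not enough data to reach the target autocorrelation


print(batch_means([1.0, 2.0, 3.0, 4.0, 5.0], 2))
# [1.5, 3.5]
```

Once a batch size has been selected, a traditional chart would be applied to the batch means instead of the individual observations.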
Runger and Willemain demonstrated that weights estimated from the ARIMA model do not guarantee a performance improvement and that it is beneficial to apply the simpler UBM method. In general, parameter-free methods do not require explicit ARIMA modeling; however, they are all based on the implicit assumption that the time-series model is adequate to describe the process. While this can be true in some industrial environments, such an approach cannot capture more complex and non-linear process dynamics that depend on the state in which the system operates, for example processes that are described by Hidden Markov Models (HMM) (see Elliott, Aggoun and Moore (1995)).
Further information is available from Ben-Gal I., Shmilovici A., Morag G., “Design of Control and Monitoring Rules for State Dependent Processes”, Journal of Manufacturing Science and Production, 3, Nos. 2-4, 2000, pp. 85-93; also Ben-Gal I., Morag G., Shmilovici A., “Statistical Control of Production Processes via Context Monitoring of Buffer Levels”, submitted (after revision); Ben-Gal I., Singer G., “Integrating Engineering Process Control and Statistical Process Control via Context Modeling”, submitted (after revision); Shmilovici A., Ben-Gal I., “Context Dependent ARMA Modeling”, Proc. of the 21st IEEE Convention, Tel-Aviv, Israel, Apr. 11-12, 2000, pp. 249-252; Morag G., Ben-Gal I., “Design of Control Charts Based on Context Universal Model”, Proc. of the Industrial Engineering and Management Conference, Beer-Sheva, May 3-4, 2000, pp. 200-204; Zinger G., Ben-Gal I., “An Information Theoretic Approach to Statistical Process Control of Autocorrelated Data”, Proc. of the Industrial Engineering and Management Conference, Beer-Sheva, May 3-4, 2000, pp. 194-199 (in Hebrew); Ben-Gal I., Shmilovici A., Morag G., “Design of Control and Monitoring Rules for State Dependent Processes”, Proc. of the 2000 International CIRP Design Seminar, Haifa, Israel, May 16-18, 2000, pp. 405-410; Ben-Gal I., Shmilovici A., Morag G., “Statistical Control of Production Processes via Monitoring of Buffer Levels”, Proc. of the 9th International Conference on Productivity & Quality Research, Jerusalem, Israel, Jun. 25-28, 2000, pp. 340-347; Shmilovici A., Ben-Gal I., “Statistical Process Control for a Context Dependent Process Model”, Proc. of the Annual EURO Operations Research Conference, Budapest, Hungary, Jul. 16-19, 2000; Ben-Gal I., Shmilovici A., Morag G., “An Information Theoretic Approach for Adaptive Monitoring of Processes”, ASI2000, Proc. of the Annual Conference of ICIMS-NOE and IIMB, Bordeaux, France, Sep. 18-20, 2000; Singer G. and Ben-Gal I., “A Methodology for Integrating Engineering Process Control and Statistical Process Control”, Proc. of the 16th International Conference on Production Research, Prague, Czech Republic, Jul. 29-Aug. 3, 2001; and Ben-Gal I., Shmilovici A., “Promoters Recognition by Varying-Length Markov Models”, Artificial Intelligence and Heuristic Methods for Bioinformatics, 30 September-12 October, San-Miniato, Italy. The contents of each of the above documents are hereby incorporated by reference.