The present invention relates to a method of statistical process control based on multidimensional processing of data, and to any system using such a method.
Its purpose is firstly to trigger a warning when the process departs from “normal” operation, which ensures that its production is of the required quality, and secondly to make proposals for identifying the probable cause(s) of the anomaly.
Statistical process control (SPC) is presently in use in a very large number of businesses, in all countries (mainly industrialized countries), for all types of industrial production: engineering, electronics, chemistry, pharmaceuticals, agri-food, plastics materials, . . . .
Its purpose is to ensure product quality by inspecting the manufacturing process itself and not only by inspecting the characteristics of its products. SPC has become essential in achieving “zero defects” and when the business seeks to comply with international quality assurance standards (ISO 9000).
Its technical objective is to detect possible drift in the manufacturing process and to remedy it before non-compliant products are manufactured.
The use of this method has now extended beyond the context of manufacturing goods and covers producing services (banking, insurance, consultancy, . . . ).
When running a process (cf. FIG. 1), various measurements (indicators) associated with the same process are tracked: input characteristics (raw materials); output characteristics (products); process operating parameters. Each unit of observation (measurement instant or element produced) is thus associated with a plurality of numerical values obtained by the measurements, enabling it to be represented by a point in the multidimensional space of the measurements taken.
The usual practice in SPC consists in monitoring the process by tracking a plurality of control charts which are graphical representations of the way an observed magnitude varies and which present predefined control limits (see FIG. 2), one per measurement. Each control chart is then interpreted independently of the others, triggering warnings independently.
Various types of control chart are available (known as “Shewhart, CuSum, EWMA, MME”), with the last three being accepted as better at detecting small amounts of “drift” than the first.
Generally, control charts are used on grouped data: by plotting the averages of a plurality of grouped-together measurements, small amounts of drift are detected better, and in addition the distribution of the values coincides better with the assumption of normality that underlies the method. By plotting the variances or the ranges of each group, it is possible to detect an increase in measurement variability that has some special cause.
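By way of illustration only, the grouping described above can be sketched as follows. The function names, the sample measurements, and the control limits 9.5 and 10.5 are hypothetical and are not taken from the present description; the sketch merely shows group means and ranges being compared with predefined limits.

```python
from statistics import mean

def group_stats(values, group_size):
    """Split values into consecutive full groups; return (means, ranges)."""
    groups = [values[i:i + group_size]
              for i in range(0, len(values) - group_size + 1, group_size)]
    means = [mean(g) for g in groups]
    ranges = [max(g) - min(g) for g in groups]
    return means, ranges

def out_of_limits(series, lower, upper):
    """Indices of plotted points that fall outside the control limits."""
    return [i for i, v in enumerate(series) if v < lower or v > upper]

# Hypothetical measurements of a single magnitude, grouped three by three.
measurements = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.9, 11.1, 11.0]
means, ranges = group_stats(measurements, 3)

# Hypothetical control limits for the chart of group means.
warnings = out_of_limits(means, 9.5, 10.5)
print(means, ranges, warnings)
```

In this sketch an incomplete trailing group is simply dropped; a real chart would also derive its limits from historical data rather than fixing them by hand.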
The usual practice which consists in simultaneously and independently monitoring a plurality of control charts constitutes a method that is clumsy and not very effective in multidimensional SPC:
it raises too many false warnings which can give rise to unnecessary corrections; these then need to be reassessed very quickly, and lead to the process being controlled in a manner that is chaotic and expensive, with multiple corrections;
it can detect real anomalies too late; and
it has difficulty in detecting the causes of anomalies when they are not directly associated with a measurement. This encourages taking a multiplicity of measurements which is expensive and leads to a multiplicity of control charts.
The method and system of the present invention seek to mitigate those drawbacks: the method is one of statistical process control on the basis of taking indicators or measurements on inputs, on outputs, and on control and operating parameters of said process, and which can be represented by observation points in frames of reference that associate their values with their sampling indices; according to the invention:
a) the observed values are transformed so that the resulting values are compatible with the multidimensional Gaussian distribution model, and constitute data corresponding to the observation points used in the remainder of the method;
b) said observation points are situated in a multidimensional space, in which each dimension is associated with a measured magnitude;
c) amongst the observation points, points that are said to be “under control” and that correspond to proper operation of the process are distinguished from points which are said to be “out of control”;
d) the distribution center of the points under control is calculated as being the center of gravity of the observation points under control;
e) out of control observation points that are concentrated in some particular direction from the distribution center of the points under control are identified;
f) this direction is associated with a common cause for drift of said process;
g) each observation point and anomaly direction pair is associated with indicators in order to propose zero, one, or more causes of anomaly that are liable to be associated with the observation that has been made; and
h) when an anomaly is analyzed in this way, a warning is triggered and the drift detected in this way in the industrial process is remedied.
Said center of gravity of the observation points being inspected corresponds to a point whose components are the means of the components of the observation points under inspection.
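Steps d) and e) can be illustrated by the following minimal sketch, which assumes that the observation points have already been transformed and labelled as under control or out of control (how this is done is not shown here). The function names and the sample points are hypothetical.

```python
import math

def center_of_gravity(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]

def unit_direction(point, center):
    """Unit vector from the center to the point (point must differ from center)."""
    d = [x - c for x, c in zip(point, center)]
    norm = math.sqrt(sum(v * v for v in d))
    return [v / norm for v in d]

# Hypothetical two-dimensional observation points.
under_control = [[1.0, 2.0], [3.0, 2.0], [2.0, 1.0], [2.0, 3.0]]
out_of_control = [[5.0, 5.0], [6.0, 6.1]]

center = center_of_gravity(under_control)
directions = [unit_direction(p, center) for p in out_of_control]
print(center, directions)
```

Here the two out of control points lie in almost the same direction from the center, which is the kind of concentration that step e) looks for.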
SPC inspection consists in conventional manner in regularly observing p continuous magnitudes y1, y2, . . . , yp either statistically or by sampling. These magnitudes can equally well represent characteristics of raw materials, characteristics of manufactured products, or operating parameters of the manufacturing process. The p-dimensional vector made up of these p measurements at a given “instant” is written y and is referred to as the observation vector of the process, with the endpoint of this vector being the observation point of the process and the origin of this vector being the origin of the frame of reference in question.
It is clear that in this context the concept of “instant” goes beyond a strictly temporal interpretation: measurements associated with the same “instant” are, wherever possible, measurements of parameters relating to the production of the same manufactured unit or batch. Perfect traceability of the manufacturing process is then necessary in order to be able to define which measurements are associated with the same “instant”.
When the process is “under control”, the values of y at various successive instants t0, t0+1, t0+2, . . . vary “little” about a value y0, which is the desired target for ensuring that production is of satisfactory quality. This variation is due to random variations in the characteristics of the raw materials (material hardness, chemical composition of a component, supplier, . . . ), of the environment (temperature, humidity, . . . ), or of the process (setting of a machine, attention of an operator, . . . ). These characteristics have influence over one or more components of y and they are written z1, z2, . . . , zm and together they form a vector written z. The vector z is referred to herein as the explanatory vector of the process.
For a characteristic of the process to be considered as an “observation variable” yj of the process, it must be evaluated at each “instant”.
For a characteristic of the process or of the inputs to be considered as a “cause variable” zk of the process, it must be modified by an agent external to the system proper: voluntary or involuntary human action, variation in the environment, wear, or aging. Generally, for reasons of expense or of feasibility, these variables are not measured at each “instant” (otherwise they would also appear as variables yj) and in this sense they constitute “hidden variables” that influence the behavior of the process. Evaluating them is often expensive, lengthy, and imprecise, and is performed only in the event of an anomaly.
A variable can be quantitative if its possible values are numerical and belong to a known range of values (temperature, pressure, . . . ), or it can be qualitative when the possible values, numerical or otherwise, are limited in number (supplier, operator, machine, . . . ). The models and methods considered in the present invention assume that the components of y are all quantitative.
The same characteristic of the process (e.g. the controlled temperature of a furnace) can appear both as a component of z and as a component of y.
The dependency between y and z can be modelled by the following relationship:
y=f(z,t)+ε
where t designates the observation instant and ε is a random vector of dimension p whose mean is assumed to be zero and which has a covariance matrix Σε. f is a vector function having p components f1, f2, . . . , fp such that yj=fj(z,t)+εj.
The components of y are correlated with one another as are the components of z.
A perfectly stabilized process under steady conditions ought to present the following aspects:
f(z,t) does not in fact depend on t;
each “cause variable” zk is stabilized on a fixed value z0k; and
y can be modelled by a steady process of the form:
y=f(z0)+ε
In reality, it is not possible to determine perfectly the quantitative cause variables: the variable z0 has added thereto a random error of zero expectation and of covariance matrix Σe. The model then becomes:
zk=z0k+ek
y=f(z0+e)+ε
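The steady model y=f(z0+e)+ε can be illustrated by a small simulation. Since the function f is unknown in practice, the function used below is purely hypothetical, as are the target z0 and the standard deviations; the errors are also drawn independently with a common standard deviation, which is a simplification of the general covariance matrices Σe and Σε.

```python
import random

def f(z):
    """Hypothetical response with m=2 cause variables and p=2 observations."""
    return [2.0 * z[0] + z[1], z[0] - z[1]]

def simulate(z0, sigma_e, sigma_eps, rng):
    """Draw one observation y = f(z0 + e) + eps with independent Gaussian errors."""
    z = [z0k + rng.gauss(0.0, sigma_e) for z0k in z0]
    return [fj + rng.gauss(0.0, sigma_eps) for fj in f(z)]

rng = random.Random(0)          # fixed seed so the run is repeatable
z0 = [1.0, 0.5]                 # hypothetical stabilized cause values
samples = [simulate(z0, 0.01, 0.02, rng) for _ in range(2000)]

# The observations scatter around f(z0) = [2.5, 0.5].
mean_y = [sum(s[j] for s in samples) / len(samples) for j in range(2)]
print(mean_y)
```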
The method proposed by the invention lies in the following context which is the usual context for SPC:
the function f is unknown;
the explanatory variables zk are not all identified; and
n observations have been made of the variables y1, . . . , yp at the “instants” t=1, . . . , t=n.
These observations are written in the form of a matrix Y having n rows and p columns. yj designates the jth column of Y; the ith element of this column is written yij and designates the observation made at instant t=i of the variable yj. The vector of observations of the variables y1, . . . , yp at the “instant” t=i is written yi.
The observation yi is given a weight pi, which is generally equal to 1/n. The diagonal matrix (n, n) having the weights pi as diagonal terms is written Dp.
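As a minimal sketch of how the weights can be used, the following computes the weighted center of the rows of Y, i.e. the sum of the pi·yi, together with a weighted covariance matrix. The small matrix Y is hypothetical, and the covariance formula (the centered rows weighted by the diagonal of Dp) is an assumption made for illustration rather than a formula stated above.

```python
def weighted_mean(Y, weights):
    """Weighted center of the rows of Y: sum over i of p_i * y_i."""
    p = len(Y[0])
    return [sum(w * row[j] for w, row in zip(weights, Y)) for j in range(p)]

def weighted_covariance(Y, weights):
    """Covariance of the columns of Y, each row i carrying the weight p_i."""
    m = weighted_mean(Y, weights)
    p = len(m)
    return [[sum(w * ((row[j] - m[j]) * (row[k] - m[k]))
                 for w, row in zip(weights, Y))
             for k in range(p)]
            for j in range(p)]

# Hypothetical matrix Y with n=3 observations of p=2 variables.
Y = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
n = len(Y)
weights = [1.0 / n] * n         # diagonal terms of Dp, here all equal to 1/n

m_y = weighted_mean(Y, weights)
V = weighted_covariance(Y, weights)
print(m_y, V)
```

With all weights equal to 1/n, the weighted center is simply the vector of the column means of Y.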
when the process is properly under control:
it is properly centered on the target value y0.
y0 is then equal to the mathematical expectation E[y] of y; y0 is thus very close to the observed mean value my (where my designates the vector constituted by the means ȳ1, . . . , ȳp of the p columns y1, . . . , yp of the matrix Y).
its variability is constant and comparable with the specification limits defined on the various variables yj,
the covariance matrix of the random vector y does not vary in time.
when the process is drifting, the observed values yi move too far away from the target value y0. Such behavior can be the result of:
variation over time in the center value of one or more cause variables zk; or
an increase over time in the variance of one or more of the random terms εj or ek.
If the drift concerns a qualitative cause variable zk, causing this variable to go from a value z0k to a value z1k, then the center of the distribution is moved from y0 to y1. The observed points are then moved in the direction y1-y0.
If the drift concerns a quantitative cause variable zk and the process is not unstable around y0 relative to zk, then it can be assumed that each function fj has a partial derivative fjk relative to zk. The calculus of variations then shows that to the first order:
drift in the mean, z1k=z0k+d implies that the center of the observed points moves from y0 in the direction defined by the vector of the partial derivatives (f1k(y0), . . . , fjk(y0), . . . , fpk(y0));
an increase in the variability of the random ek will give rise to the distribution of the observed points yi being “stretched” in the same direction:
(f1k(y0), . . . , fjk(y0), . . . , fpk(y0))
an increase in the variability of the random εj gives rise to the distribution of the observed points yi being “stretched” in the direction of the jth basis vector (0, . . . , 0, 1, 0, . . . , 0), whose p components are all zero except for the 1 in position j.
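The first-order reasoning above can be checked numerically on a known function. In the following sketch the function f, the operating point z0 (the cause values corresponding to y0), and the drift d are all hypothetical; the partial derivatives relative to zk are estimated by central finite differences, and the displacement of the center under a small drift is compared with the first-order prediction.

```python
def drift_direction(f, z0, k, h=1e-6):
    """Central-difference estimate of (df_1/dz_k, ..., df_p/dz_k) at z0."""
    z_plus = list(z0)
    z_minus = list(z0)
    z_plus[k] += h
    z_minus[k] -= h
    return [(a - b) / (2 * h) for a, b in zip(f(z_plus), f(z_minus))]

def f(z):
    """Hypothetical smooth response with m=2 causes and p=2 observations."""
    return [z[0] ** 2 + z[1], 3.0 * z[0] - z[1]]

z0 = [1.0, 2.0]
direction = drift_direction(f, z0, k=0)   # direction associated with z_0

# A small drift d in the mean of the cause variable moves the center
# approximately by d times the vector of partial derivatives.
d = 0.01
shifted = f([z0[0] + d, z0[1]])
predicted = [y + d * g for y, g in zip(f(z0), direction)]
print(direction, shifted, predicted)
```

In practice f is unknown, so such directions would have to be estimated from out of control observations rather than from derivatives; the sketch only illustrates why a single cause tends to displace the points along a single direction.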
The method of the present invention seeks and manages to achieve the following technical objectives:
during a historical analysis stage:
identifying the directions associated with any drift that has been identified in the historical record, and defining parameters that make it possible to calculate, for each observation, proximity indicators for these directions;
during an operational stage of controlling the process:
detecting whether the latest observation reveals any drift in the operation of the process and then, by examining the proximity indicators, identifying the identified cause direction that appears closest to the observed point, thus proposing a probable cause or causes for the drift;
proposing in both stages graphical representations that enable the situation to be evaluated quickly and as a whole.
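The operational stage can be pictured with the following sketch. The proximity indicator used here, the cosine of the angle between the deviation of the latest observation from the center and each stored cause direction, is an assumption chosen for illustration; the indicators actually used by the method are not defined in this passage. The cause names, directions, center, and observation are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def closest_cause(observation, center, cause_directions):
    """Return the stored direction best aligned with the observed deviation."""
    deviation = [x - c for x, c in zip(observation, center)]
    scores = {name: cosine(deviation, d)
              for name, d in cause_directions.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical results of the historical analysis stage.
center = [2.0, 2.0]
cause_directions = {"cause_A": [1.0, 1.0], "cause_B": [1.0, -1.0]}

# Hypothetical latest observation, assumed already flagged as drifting.
best, scores = closest_cause([4.0, 3.5], center, cause_directions)
print(best, scores)
```

A signed cosine suits a drift in the mean; for a direction along which the distribution is merely stretched, the absolute value of the cosine might be the more natural choice.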
In order to adapt the system to the particular features of each process, several versions are proposed for each kind of processing performed in the system.