All clinical laboratories in the United States must comply with the Clinical Laboratory Improvement Amendments of 1988 (CLIA '88). CLIA '88 established the minimum standards for all laboratory testing, including specific regulations for quality control. Although CLIA '88 does not explicitly recommend a method for determining when a system is out of control, the federal law does state that laboratories must:
- perform control procedures using at least two levels of control materials each day of testing
- establish written procedures for monitoring and evaluating analytical testing processes
- follow the manufacturer's instructions for quality control
To achieve the goal of quality control, including the precision and accuracy of test results, it is necessary to be able to detect errors within the system as soon as possible. Before designing an error detecting system, something must be known about the nature of the errors to be detected. Typically, the errors in a quantitative system are random errors and systematic errors.
Random errors are always present to a measurable degree in any system under a given set of circumstances: glucose meters (the devices), operators, test strips (the reagent), and control solutions (the control material), for example. The amount of random error, commonly described in terms of precision, is usually measured by the standard deviation (SD) and the coefficient of variation (CV). The SD measures the scatter (or variability around the mean) in the data points (test results), while the CV is the standard deviation expressed as a percentage of the mean.
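As an illustration of these two measures, the following minimal sketch computes the mean, SD, and CV for one level of control material. The function name and the glucose values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import statistics

def precision_summary(results):
    """Return (mean, SD, CV in percent) for a list of control results."""
    mean = statistics.mean(results)
    sd = statistics.stdev(results)   # sample SD (n - 1 in the denominator)
    cv_percent = 100.0 * sd / mean   # SD expressed as a percentage of the mean
    return mean, sd, cv_percent

# Hypothetical control results in mg/dL, not real data.
glucose_controls = [98, 102, 100, 97, 103, 101, 99, 100]
mean, sd, cv = precision_summary(glucose_controls)
print(f"mean={mean:.1f} SD={sd:.2f} CV={cv:.2f}%")
```

For these eight values the mean is 100 mg/dL, the SD is 2 mg/dL, and the CV is therefore 2%.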
The other type of error is systematic error. Systematic errors, which include shifts and trends, occur in one direction away from the true value and are measured by using the mean. Accuracy is the term used when referring to how close a test result is to the true value.
The error detecting system should not only detect these two types of errors but also indicate whether an error is random or systematic, because the type of error points the investigation toward its likely cause.
The manufacturer's stated QC ranges give an indication of where the mean and QC limits may be, but the manufacturer's data is not considered an appropriate substitute for a mean and QC limits determined from the institution's own established data. Each institution should determine the performance of its measurement system and set an appropriate mean and QC limits for the controls based on its own data. New reagent and/or control material should be analyzed for each analyte in parallel with the reagent and/or control material currently in use.
The National Committee for Clinical Laboratory Standards (NCCLS) recommends that as a minimum, 20 data points from 20 or more separate runs be obtained to determine an estimate of mean and standard deviation for each level of control material. A run is typically defined in terms of a length of time or a number of samples analyzed. Better estimates of both mean and standard deviation can be achieved when more data is collected. Additionally, the more controls run, the easier it is to detect true changes in the measurement system.
It is important to include all valid data points obtained with the selected collection method. For example, if values outside ±2 SD are excluded from the data, an artificially small estimate of variability may be calculated.
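The effect of discarding valid points can be seen in a small sketch. The 20 results below are hypothetical values standing in for 20 separate runs. The code estimates the SD from all of them and then again after dropping every value beyond ±2 SD.

```python
import statistics

# Hypothetical control results (mg/dL) from 20 separate runs.
results = [100, 101, 99, 102, 98, 100, 103, 97, 101, 99,
           100, 104, 96, 100, 102, 98, 101, 99, 106, 94]

mean = statistics.mean(results)
full_sd = statistics.stdev(results)

# Discard anything beyond +/-2 SD, as a laboratory might be tempted to do.
kept = [x for x in results if abs(x - mean) <= 2 * full_sd]
trimmed_sd = statistics.stdev(kept)

print(f"all {len(results)} points: SD = {full_sd:.2f}")
print(f"{len(kept)} points kept: SD = {trimmed_sd:.2f}")
```

In this example, excluding the two extreme values lowers the SD estimate. Limits built from the trimmed estimate would be narrower than the system's real variability warrants.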
If data collection is to be representative of future system performance, sources of variation that are expected and determined acceptable may be included during the data collection period. Examples include multiple devices, reagent lots, control material lots, and operators.
The Gaussian distribution, or bell-shaped curve, is the most frequently used model when analyzing clinical data. Using the true standard deviation, statistical theory shows that 99.73% of the data will fall within ±3 SD of the mean, 95.44% will fall within ±2 SD of the mean, and 68.26% will fall within ±1 SD of the mean for each level of control material. (Standard deviation estimates from actual data may vary from the true standard deviation.)
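The coverage figures can be reproduced from the Gaussian error function. This is a minimal sketch with a hypothetical helper name, `coverage_within`. The printed values may differ from the tabulated figures quoted above in the last decimal place because of rounding.

```python
import math

def coverage_within(k):
    """Fraction of a Gaussian population lying within +/-k SD of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within +/-{k} SD: {100 * coverage_within(k):.2f}%")
```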
After determining the mean and SD of a measuring system, many institutions will decide to set control limits as some multiple of the SD around the mean, for example, at ±2 SD or ±3 SD, to determine when a system is out of control. The problem with this single-rule method is that even if there is no change in the performance of the system and everything is operating as expected (the system is in control), 4.56% (100−95.44=4.56) of values will still fall outside the ±2 SD limits. These points are considered false rejections, and the more data points you collect, the higher the number of false rejections encountered. Thus, while the ±2 SD limit offers a very sensitive method for detecting a change, it also presents a real problem: a high rate of false rejection.
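How quickly false rejections accumulate can be estimated with a short calculation. The sketch below assumes the control results are independent and Gaussian, and uses the 95.44% figure quoted above. The function name is hypothetical.

```python
def false_rejection_probability(n_controls, within_limits=0.9544):
    """Chance that at least one of n in-control results falls outside the limits.

    Assumes independent results; within_limits is the fraction expected
    inside the +/-2 SD limits when the system is in control.
    """
    return 1.0 - within_limits ** n_controls

for n in (1, 2, 4, 20):
    p = false_rejection_probability(n)
    print(f"{n:>2} control results: {100 * p:.1f}% chance of a false rejection")
```

Under these assumptions, a single control result is falsely rejected 4.56% of the time, and with two control results the chance that at least one is falsely rejected rises to roughly 9%.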
Multirule quality control methods use a combination of control rules to more accurately decide whether analytical runs are in control or out of control. Unlike the 2-SD or 3-SD limit rules described above, the Westgard Multirule Procedure (Westgard 1981) uses six different control rules to judge the acceptability of an analytical run. The advantages of a multirule QC method are that false rejections can be kept low while at the same time maintaining high error detection.
The following summarizes the individual Westgard 1981 control rules:
Rule   Definition
12s    One result falls outside 2 SD.
13s    One result falls outside 3 SD.
22s    Two consecutive results fall outside 2 SD on the same side of the mean.
R4s    The range of two results is greater than 4 SD.
41s    Four consecutive results fall outside 1 SD on the same side of the mean.
10x    Ten consecutive results fall on one side of the mean.
To perform multirule QC, start by collecting control data and establishing the means and SDs for each level of control material. If performing QC manually (plotting and interpreting data without the use of a computer program), create a Levey-Jennings chart and draw lines at the mean, ±1 SD, ±2 SD, and ±3 SD.
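The positions of the chart lines follow directly from the established mean and SD. The sketch below computes them for one control level. The function name and the example mean and SD (100 and 2 mg/dL) are hypothetical.

```python
def levey_jennings_lines(mean, sd):
    """Return the horizontal lines of a Levey-Jennings chart, keyed by label."""
    lines = {"mean": mean}
    for k in (1, 2, 3):
        lines[f"+{k} SD"] = mean + k * sd
        lines[f"-{k} SD"] = mean - k * sd
    return lines

# Hypothetical control level: mean 100 mg/dL, SD 2 mg/dL.
chart = levey_jennings_lines(100.0, 2.0)
for label in ("+3 SD", "+2 SD", "+1 SD", "mean", "-1 SD", "-2 SD", "-3 SD"):
    print(f"{label:>5}: {chart[label]:.1f}")
```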
In manual applications, the 12s rule should be used as a warning to trigger application of the other rules. It indicates that one should look carefully before proceeding. Stop if the 13s rule is broken. Stop if the 22s rule is broken. Stop if the R4s rule is broken. Often the 41s and 10x rules must be used across runs in order to get a sufficient number of control measurements needed to apply the rules.
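The manual procedure above can be sketched in code. This is one possible reading of the rules as defined in this section, not a validated implementation. It assumes each control result has already been converted to a z-score, (result − mean) / SD, and that the z-scores form a single chronological series with the latest result last. The function names and return values are hypothetical.

```python
def same_side_beyond(zs, limit):
    """True if every z-score lies beyond `limit` SD on the same side of the mean."""
    return all(z > limit for z in zs) or all(z < -limit for z in zs)

def evaluate_multirule(z_scores):
    """Judge the latest control result in a chronological list of z-scores.

    Returns (decision, violated), where decision is "accept", "warning",
    or "reject" and violated lists the rejection rules that were broken.
    """
    if not z_scores:
        raise ValueError("at least one control result is required")
    latest = z_scores[-1]
    if abs(latest) <= 2:                 # 12s warning not triggered
        return "accept", []
    violated = []
    if abs(latest) > 3:                  # 13s
        violated.append("13s")
    if len(z_scores) >= 2:
        last2 = z_scores[-2:]
        if same_side_beyond(last2, 2):   # 22s
            violated.append("22s")
        if max(last2) - min(last2) > 4:  # R4s
            violated.append("R4s")
    if len(z_scores) >= 4 and same_side_beyond(z_scores[-4:], 1):    # 41s
        violated.append("41s")
    if len(z_scores) >= 10 and same_side_beyond(z_scores[-10:], 0):  # 10x
        violated.append("10x")
    return ("reject", violated) if violated else ("warning", [])

print(evaluate_multirule([0.5, -1.2, 1.9]))
print(evaluate_multirule([0.3, 2.4]))
print(evaluate_multirule([2.1, 2.5]))
```

The three calls show a result inside ±2 SD (accepted), a lone 12s warning with no other rule broken, and a 22s rejection. Because 12s acts as the trigger, the 41s and 10x rules are evaluated only when the latest result exceeds ±2 SD. A program that screens every result against every rule would drop the early return.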
A software program should be able to select the individual rejection rules on a test-by-test basis to optimize the performance of the QC procedure on the basis of the precision and accuracy observed for each analytical method and the quality required by the test.