Affinity assays are important measurement tools used in clinical and research laboratories for determining the concentration of an analyte in a sample. The term "analyte" refers broadly to any substance (a ligand) capable of being detected by formation of a complex with a corresponding anti-ligand. Such complex formation is an integral part of various classes of affinity assays, including, for example, immunoassays (such as radioimmunoassays and immunoradiometric assays) and protein-binding assays (such as radioreceptor assays). Thus, the analyte may be an antigen or antibody in an immunoassay, or a protein or its cognate in a protein-binding assay.
Such assays may take a variety of formats. In a direct assay format, a complex is formed between the analyte and the anti-ligand. In an indirect assay format such as a competitive assay, a complex is formed between a competitor and the anti-ligand. The competitor competes with the analyte for specific attachment to the anti-ligand.
To detect the presence of an analyte using an affinity assay, one of the members of a complex is labelled with a tag which is capable of being detected. Examples of classes of such tags include radioisotopes, e.g., iodine-125; enzymes, e.g., horseradish peroxidase; fluorescent compounds; bioluminescent compounds, e.g., luciferin; and chemiluminescent compounds, e.g., acridinium esters. Each tag emits a corresponding experimental indicator, such as gamma radiation, light, fluorescent light or enzyme activity, which may be detected.
The response, i.e., the amount of experimental indicator detected, for a given concentration of analyte in a sample varies probabilistically. This variation is known as response error, and is Poisson for gamma radiation and is either Normal or log-Normal for the other types of experimental indicators noted above. The amount of variation may change with the mean level of the response. For a given concentration of analyte in a sample the response may also vary due to variation in the performance of the assay from one assay run to the next or due to variation in the preparation of the assayed samples. This variation may be due to errors in pipetting or mixing, incomplete separation between bound and unbound sample components, variation in reagent quality, or variation in response measurement time, incubation, temperature or centrifuging conditions. This variation introduces a random error in the concentration of the analyte. The relative error due to such experimental variation is experimental error.
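As a brief illustration of why the amount of variation changes with the mean level of the response, consider Poisson response error alone. For a Poisson count the variance equals the mean, so the relative response error falls as the expected count rises. The function name and the count levels below are hypothetical and chosen only for illustration.

```python
import math

def poisson_relative_error(expected_counts):
    """Relative error (sd / mean) of a Poisson count, which is 1 / sqrt(mean)."""
    return 1.0 / math.sqrt(expected_counts)

# Hypothetical expected gamma counts for three samples.
for counts in (100, 1_000, 10_000):
    pct = 100.0 * poisson_relative_error(counts)
    print(f"expected counts = {counts:6d}  relative response error = {pct:.1f}%")
```

This sketch covers response error only; experimental error, as described above, adds further variation that is not captured here.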
Because of both response error and experimental error, measuring the concentration of an analyte in a sample using an affinity assay requires formal statistical analysis of the responses from a number of samples, both to ensure accurate interpretation of the data and to maintain quality control. Statistical analysis is also used to determine the accuracy of an inferred concentration.
Conventionally, using an affinity assay to measure an unknown concentration of analyte in a sample involves analyzing three sets of samples: standard samples, control samples, and the unknown analyte sample. Standard samples are prepared by the assay kit manufacturer in accordance with World Health Organization (WHO) or United States National Bureau of Standards (USNBS) specifications and contain known concentrations of the analyte, spanning a range of concentrations sufficient to establish a standard curve from which the concentration of the unknown sample can be inferred. Preferably, this range includes the concentrations which are believed to be important. The control samples contain known concentrations of the analyte and are used to chart assay stability and experimental error. Typically, the controls are prepared from samples assayed previously in the current laboratory.
Typically, a large number of unknown samples are assayed together with the standards and controls in what is commonly called a single assay run. In a single assay run, standard samples are assayed first in replicate to obtain a measurement of an amount of the experimental indicator (i.e., a response) by each replicate. For example, with a radioligand binding assay the number of radioactive counts emitted in a given time period by each sample and its replicate is recorded. A standard curve, which relates a known concentration of analyte to an expected response, is estimated, in a manner to be described in more detail below, based on only the known concentrations and corresponding responses for the replicates of the standard samples.
The samples containing unknown concentrations of analyte are then assayed, typically in replicate, along with replicates of the control samples interspersed between the unknown samples. The perception that better quality control can be achieved by automated assay procedures, the rising cost of assaying large numbers of samples, and the desire to reduce the production of radioactive waste (resulting from assays using radioisotopes) have led many laboratories to assay unknown samples as singlets, not replicates. The response for each unknown and control sample replicate is recorded. The analyte concentration in each unknown sample is inferred by finding the ordinate on the estimated standard curve which has the response for the unknown sample as its abscissa.
The standard curve is conventionally estimated by fitting a curve to the known concentrations for the replicates of only the standard samples and their associated responses using a non-linear, weighted least squares technique. The curve has empirically been found to correspond to a four-parameter logistic model (4-PL). This curve sometimes approximates the curve described by the mass-action law for the underlying chemical reaction.
A conventional model used for estimating the standard curve may be described in the following manner:
Let $H$ denote a concentration of analyte and let $h = \log H$. The log scale is commonly used to describe many analyte concentrations since the biologically possible range of concentrations often covers several orders of magnitude. Let $Y$ be the response recorded from assaying a sample. For notational purposes, the control, the standard and the unknown samples are denoted respectively as groups 1, 2 and 3. Let $N_i$ be the number of samples in group $i$, let $H_i = [H_{i,1}, \ldots, H_{i,N_i}]^T$ be the analyte concentrations in the $i$th group samples, let $h_i = [h_{i,1}, \ldots, h_{i,N_i}]^T$ be the log analyte concentrations of the $i$th group samples, and let

$$Y_i = \begin{bmatrix} Y_{i,1,1} & \cdots & Y_{i,1,q} \\ \vdots & & \vdots \\ Y_{i,N_i,1} & \cdots & Y_{i,N_i,q} \end{bmatrix} \qquad (1)$$

be the $N_i \times q$ matrix of measurements obtained from measuring the amount of experimental indicator emitted by the samples assayed in group $i$, for $i = 1, 2, 3$, where $q$ is the number of replicates (usually 1 or 2).
In an immunoassay or protein-binding assay, the expected measure of experimental indicator is usually a monotonic function of the analyte concentration, to which a four-parameter logistic (4-PL) model has been found empirically to fit well. A parametrization of the 4-PL model is:

$$E(Y \mid h, \theta) = \frac{\mathit{max} - \mathit{min}}{1 + \exp(\beta - \gamma h)} + \mathit{min} \qquad (2)$$

where $\theta = [\mathit{max}, \gamma, \beta, \mathit{min}]^T$. Other parametrizations are also well-known. If $\epsilon$ is set to be $-\gamma$, with $\rho = \exp(\beta/\gamma)$ and $\theta^* = [\mathit{max}, \epsilon, \rho, \mathit{min}]^T$, then (2) may be rewritten as

$$E(Y \mid H, \theta^*) = \frac{\mathit{max} - \mathit{min}}{1 + (H/\rho)^{\epsilon}} + \mathit{min} \qquad (3)$$

which is the conventional "Rodbard" model. See either "Statistical Analysis of Radioligand Assay Data," Methods in Enzymology, Volume 37, 1975, pp. 3-22, by D. Rodbard and G. R. Frazier ("Rodbard") or "Radioligand Assay," Biometrics, Volume 32, 1976, pp. 721-740, by D. J. Finney ("Finney 1976") for a description of this model. These references, and all others cited in this document, are expressly incorporated by reference. A graph representing equation (2) is shown in FIG. 1, where the ordinate represents the concentration H and the abscissa represents the expected measure E(Y).
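The relationship between the two parametrizations can be checked numerically. The following is a minimal sketch that assumes equation (2) has the form E(Y) = (max - min) / (1 + exp(beta - gamma * h)) + min, which is the form consistent with the substitutions epsilon = -gamma and rho = exp(beta / gamma) stated above. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import math

def four_pl_log(h, max_, gamma, beta, min_):
    """Assumed form of equation (2): expected response at log concentration h."""
    return (max_ - min_) / (1.0 + math.exp(beta - gamma * h)) + min_

def four_pl_rodbard(H, max_, eps, rho, min_):
    """Rodbard form: expected response at concentration H."""
    return (max_ - min_) / (1.0 + (H / rho) ** eps) + min_

# Hypothetical parameter values (not taken from any real assay).
MAX, GAMMA, BETA, MIN = 10000.0, -1.2, -2.0, 500.0
EPS = -GAMMA                    # epsilon = -gamma
RHO = math.exp(BETA / GAMMA)    # rho = exp(beta / gamma)

for H in (0.1, 1.0, 10.0, 100.0):
    a = four_pl_log(math.log(H), MAX, GAMMA, BETA, MIN)
    b = four_pl_rodbard(H, MAX, EPS, RHO, MIN)
    print(f"H = {H:6.1f}  log form = {a:10.3f}  Rodbard form = {b:10.3f}")
```

Under these assumptions the two forms agree at every concentration, and the response at H = rho lies midway between max and min.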
The 4-PL model is useful because it seems to fit empirically the mass-action law equations governing the kinetics of some affinity assays under some conditions which have been described in the art. See, for such a description, "Interrelations of the various mathematical approaches to radioimmunoassay," Clinical Chemistry, Volume 29, 1983, pp. 284-289, by A. A. Fernandez et al. ("Fernandez"), or "Response curves for Radioimmunoassay," Clinical Chemistry, Vol. 29, 1983, pp. 1762-1766, by D. J. Finney ("Finney 1983") or "A Comparison of Logistic and Mass-Action Curves for Radioimmunoassay Data," in Clinical Chemistry, Vol. 29, 1983, pp. 1757-1761 ("Raab").
After a standard curve is estimated using the measured responses for the standard samples, a concentration of analyte is estimated for each of the unknown samples using the estimated standard curve. The average concentration of replicates, computed from the individual estimates, is reported as the estimated concentration in the unknown sample, provided that the individual estimates do not differ appreciably from each other.
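The conventional estimation step described above can be sketched as follows, assuming the standard curve takes the Rodbard form E(Y) = (max - min) / (1 + (H / rho)^epsilon) + min with max greater than min. The fitted parameter values and the duplicate responses are hypothetical, and the function names are introduced only for this illustration.

```python
from statistics import mean

def four_pl(H, max_, eps, rho, min_):
    """Expected response at concentration H (Rodbard form, assumed here)."""
    return (max_ - min_) / (1.0 + (H / rho) ** eps) + min_

def inverse_four_pl(y, max_, eps, rho, min_):
    """Concentration whose expected response equals y (requires min_ < y < max_)."""
    if not (min_ < y < max_):
        raise ValueError("response lies outside the range of the standard curve")
    return rho * ((max_ - y) / (y - min_)) ** (1.0 / eps)

# Hypothetical fitted parameters and duplicate responses, for illustration only.
THETA = dict(max_=10000.0, eps=1.2, rho=5.0, min_=500.0)
replicate_responses = [5400.0, 5150.0]

# Invert the curve for each replicate, then report the average estimate.
estimates = [inverse_four_pl(y, **THETA) for y in replicate_responses]
reported = mean(estimates)
print("individual estimates:", [round(e, 3) for e in estimates])
print("reported concentration:", round(reported, 3))
```

A response at or beyond the fitted max or min has no finite inverse on the curve, which is why the sketch rejects such values rather than returning an estimate.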
The average and the standard deviation of the individual estimated concentrations for the replicates of an unknown sample are used to compute the intra-assay coefficient of variation for the estimated analyte concentration. That is, they are used to quantify a measure of accuracy of the estimate. The estimates of the concentrations for the control samples are used to compute the intra-assay coefficients of variation at selected concentrations throughout the range of the assay. The inter-assay coefficients of variation are computed from the estimated concentrations for the control samples obtained from different assay runs.
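The coefficient of variation computations described above can be sketched as follows. The sketch assumes the coefficient of variation is the sample standard deviation divided by the mean, expressed as a percentage. All concentration values are hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(estimates):
    """Sample standard deviation divided by the mean, as a percentage."""
    return 100.0 * stdev(estimates) / mean(estimates)

# Hypothetical estimated concentrations.
duplicates_one_run = [4.8, 5.2]              # replicates of one sample, one run
control_across_runs = [5.1, 4.7, 5.4, 4.9]   # same control, four different runs

intra_cv = coefficient_of_variation(duplicates_one_run)
inter_cv = coefficient_of_variation(control_across_runs)
print(f"intra-assay CV: {intra_cv:.2f}%")
print(f"inter-assay CV: {inter_cv:.2f}%")
```

With a singlet there is only one estimate, so `stdev` raises an error. The same limitation applies to the conventional method whenever an unknown sample is assayed without replicates.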
Nearly every laboratory in the world which uses affinity assays uses some form of the method described above. However, several theoretical and practical problems with this method exist.
First, for most assays the experimental error is neither routinely nor formally assessed on the analyte concentration scale. One current approach to accounting for variations in the response beyond that due to expected response error from known physical properties of the tag is to model all variation in the response data as a simple polynomial function of the expected response for a given analyte concentration, i.e., as response error. If the polynomial order is one and there is no constant term, the data contain only response error, whereas if it is greater than one, there is extra response variation (either Poisson or Gaussian) as well. For many radioligand binding assays the best choice of exponent has been found to lie between one and two (see Finney 1976).
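A variance model of this kind can be sketched as follows. The sketch assumes the simplest case, a single power term Var(Y) = sigma2 * mu^power with no constant term, where mu is the expected response. This form and all function names and numbers are assumptions made for illustration, and the weights shown are the usual reciprocal-variance weights of weighted least squares.

```python
def response_variance(mu, sigma2=1.0, power=1.0):
    """Assumed variance model: sigma2 * mu**power (power = 1 is Poisson-like)."""
    return sigma2 * mu ** power

def wls_weights(expected, sigma2=1.0, power=1.0):
    """Reciprocal-variance weights for weighted least squares."""
    return [1.0 / response_variance(mu, sigma2, power) for mu in expected]

def weighted_sse(observed, expected, weights):
    """Weighted sum of squared residuals, the quantity minimized in curve fitting."""
    return sum(w * (y - mu) ** 2 for y, mu, w in zip(observed, expected, weights))

# Hypothetical expected and observed counts for two standards.
expected = [400.0, 900.0]
observed = [420.0, 870.0]

for power in (1.0, 1.5, 2.0):
    w = wls_weights(expected, power=power)
    print(f"power = {power}: weighted SSE = {weighted_sse(observed, expected, w):.4f}")
```

Raising the exponent above one inflates the assumed variance at high responses and so reduces the weight given to those observations, but the extra variation is still expressed on the response scale and not as error in concentration.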
Rodbard also suggests the "relative error model" as a method for studying the effect of experimental variation on the extra-Poisson variation in the response data of a radioligand binding assay. See also "Statistical aspect of radioimmunoassay," by D. Rodbard, in Principles of Competitive Protein Binding Assays, Ed. W. D. Odell et al., Philadelphia: Lippincott, 1971, pp. 204-259, ("Rodbard 1971"), for a similar suggestion. With this method, any experimental variation is represented by an estimate of the amount of extra-Poisson variation in the observed measure of radioactivity of the standards as determined by the type of weights used in fitting the standard curve with nonlinear regression methods. Thus, experimental variation is modelled in terms of the response instead of explicitly as error in the concentration of the samples being assayed.
It is important to estimate experimental variation because it affects the measurement accuracy of all unknown samples, and gives information to laboratory personnel concerning the consistency and quality of materials and technique. Because of the manner in which experimental variations are considered by models used in these methods, experimental error cannot be accurately determined on the analyte scale.
A second problem with conventional methods is that it is not possible to obtain a reasonable determination of the accuracy of the estimated analyte concentration for a singlet. Therefore replicate unknown samples must be assayed.
Third, because both the response and the concentration of analyte are random variables, directly inverting the estimated standard curve describing the expected response for a given analyte concentration does not describe correctly the expected analyte concentration for a given response. See Non-linear Regression, by G. A. F. Seber and C. J. Wild, (New York: John Wiley & Sons), 1989, pp. 247-250 ("Seber"). Although any resulting inaccuracy in concentration estimates is minimal, the effect of this inaccuracy on the determination of the accuracy of these estimates is substantial, as described below.
Fourth, because the minimal detectable dose (MDD) is not determined as part of a single assay run, a predetermined MDD does not correctly represent the smallest quantity of analyte detectable by a given assay run. The assay MDD or sensitivity (defined as the smallest analyte concentration which the assay can just distinguish from an apparent blank concentration) is usually determined when the assay is first prepared by assaying a large number, e.g., 20 to 40, of replicates of samples with an apparent blank concentration (blank samples). With conventional assay procedures, the standard curve is estimated for a single assay run from ten to twelve observations, i.e., with five or six standards each assayed in duplicate, of which only one standard, and thus two observations, are from blank samples. Because the MDD is predetermined using a number of standards far in excess of the number used on a day-to-day basis in the laboratory, it cannot reliably be used as the MDD for a single assay run. Instead, the practice in many laboratories is to report only the analyte concentrations which exceed that of the smallest non-blank standard sample.
The definition of the MDD provided by Rodbard 1978 is a conventional approximation to the upper limit of the $1 - \alpha$ highest probability density (HPD) interval, where $\alpha$ is between 0 and 1, for the apparent blank analyte concentration and as such, considers the uncertainty in the determination of the apparent blank analyte concentration. However it does not consider the uncertainty in any other analyte concentrations. For further discussions of the MDD see "Statistical estimation of the minimal detectable concentration for radio immunoassays" by D. Rodbard in Analytical Chemistry, Vol. 90, 1978, pp. 1-12, ("Rodbard 1978"); "Determining the lowest limit of reliable assay measurement," by L. Oppenheimer et al., in Analytical Chemistry, Vol. 55, 1983, pp. 638-643 ("Oppenheimer"); "Variance functions and minimal detectable concentration in assays," in Biometrika, Volume 75, Number 3, 1988, pp. 549-556, by M. Davidian et al. ("Davidian").
Finally, the validity of the estimation of accuracy obtained using conventional methods is questionable. While conventional methods allow for a systematic analysis of data, these methods rely on and are derived from large sample theory. The validity of inferences for affinity assays using large sample approximations is questionable since most standard curves are estimated from not more than a dozen observations, not from a large sample. This problem has been discussed in the art. See, for example, Davidian.
Despite these problems, all laboratories currently use these conventional methods. An alternative method for obtaining an estimated concentration for an unknown sample, the application of Bayes' rule, has been suggested in "A Note on the Problem of Statistical Calibration" by T. Lwin and J. S. Maritz in Applied Statistics, Volume 29, pp. 135-141, 1980 ("Lwin"). To use Bayes' rule, a specification of a prior distribution for the analyte concentration in the unknown sample is required. However, the prior density specified by Lwin has been discredited as unrealistic for practical applications. (See Seber, p. 248). The prior density specified by Lwin is particularly unrealistic for use in measuring analyte concentrations using affinity assays.
A Bayesian approach to the estimation and use of a standard curve has been discussed generally in "A Bayesian Approach to Calibration" by I. R. Dunsmore, Journal of the Royal Statistical Society, Series B, Volume 31, pp. 396-405, 1968 ("Dunsmore"). However, in the standard curve described in Dunsmore, the expected response is assumed to be directly proportional to the quantity to be estimated. As such it is too simplistic for, and wholly inapplicable to, affinity assays, for which a standard curve is typically described by a sigmoid function, such as the 4-PL model.