A. Field of the Invention
The present invention relates generally to spectral calibration and, more particularly, to spectral calibration resulting from monitoring a mixture containing multiple spectrally-distinguishable species.
B. Description of Related Art
Many materials emit light when excited by incident light. The characteristics of the emitted light in response to various excitation wavelengths help identify the material. Different response spectra can also identify different properties of the material and may be used to differentiate components within a cell. To aid in the identification, cellular components, such as DNA fragments, are labeled with different fluorescent dyes that have distinct spectral responses, making the components spectrally distinct.
Spectral calibration estimates reference spectral profiles (i.e., reference spectra) of particular fluorescent dyes using a fluorescent separation system. Conventional approaches to spectral calibration measure the spectral profile of each fluorescent dye separately. For a set of N dyes, these approaches result in reduced throughput and increased system complexity because they require N lanes on gel-based media and N separate runs on capillary-based media. Furthermore, with conventional reaction kit products, such as the PE Biosystems Big Dye Terminator™ kit, the reagents for the reactions (in this case, for DNA sequencing) are pre-mixed, making it impossible to perform separate dye-specific calibration runs using the actual reaction-kit products. This problem may therefore necessitate the manufacturing of special-purpose calibration kits and the design of a corresponding instrument calibration protocol in order to obtain the reference spectra in an automated way.
While this is feasible, it results in greater expense and/or greater complexity for the end user. In addition, from a manufacturing standpoint, there is the added burden of quality control on the production of the calibration kits. For most applications, the reference spectra obtained using the calibration kit must match the "true" spectra exhibited by the target reaction kit products to a high degree of accuracy, usually to within 5% in terms of miscalibration error (see below). While meeting such a specification is certainly achievable, chemistry and/or manufacturing differences that distinguish the calibration-kit products from those generated using the target reaction kits provide a greater opportunity for the introduction of systematic miscalibration error. Examples may include (1) differences in the manner in which the dyes are chemically attached to the cellular components, and (2) the possible introduction of low levels of dye-label to fragment-size cross-contamination in the manufacturing of the calibration kit products. In this regard, the "calibration kit" approach stands in contrast to spectral calibration performed using dye-labeled cellular components (e.g., DNA) generated from the target reaction kits themselves, with the aid of additional automated data-processing algorithms.
A situation in which the spectra measured in a calibration run differ slightly from the spectra exhibited by the products of the target reaction kit leads to systematic error in subsequent production-run data processing. Typically, the calibration matrix is used in a first step of processing the raw fluorescence-intensity data in a procedure known as multicomponenting. Assume, for example, that the instrument collects fluorescence intensity values during each instance of measurement over several predefined ranges of light wavelength (i.e., virtual filters). The raw-data electropherogram constitutes the sequential collection of all such measurements made by the system during the electrophoresis run. Multicomponenting, then, consists of a matrix multiplication of each measurement "vector" by the inverse of the calibration matrix. If the number of virtual filters collected from the instrument is greater than the number of dye signals to be computed, the pseudo-inverse of the calibration matrix is used in place of the ordinary inverse. This process is intended to transform the intensity data from the vector space of relative intensity in each of the virtual filters to the vector space of relative dye-label concentrations in a way that is mathematically optimal in a least mean-squares sense.
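By way of illustration only, the multicomponenting step described above might be sketched as follows. The sketch assumes a hypothetical calibration matrix with five virtual filters (rows) and four dyes (columns); the numerical values, the function names, and the use of the normal-equations form of the pseudo-inverse are assumptions made for the example and are not taken from any particular instrument.

```python
# Illustrative sketch of multicomponenting; all names and values are hypothetical.

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ x = b for square a by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def pseudo_inverse(c):
    """Left pseudo-inverse (C^T C)^-1 C^T; assumes C has full column rank."""
    ct = transpose(c)
    return solve(matmul(ct, c), ct)

def multicomponent(calibration, measurement):
    """Transform one virtual-filter intensity vector into relative dye signals."""
    column = [[v] for v in measurement]
    return [row[0] for row in matmul(pseudo_inverse(calibration), column)]

# Assumed calibration matrix: each column is one dye's reference spectrum
# sampled over five virtual filters.
CALIBRATION = [
    [0.80, 0.20, 0.05, 0.00],
    [0.50, 0.90, 0.30, 0.05],
    [0.10, 0.40, 0.85, 0.30],
    [0.02, 0.10, 0.40, 0.90],
    [0.00, 0.02, 0.10, 0.45],
]

# Synthesize a noise-free measurement from known dye amounts, then unmix it.
true_dyes = [100.0, 0.0, 25.0, 0.0]
measurement = [sum(c * d for c, d in zip(row, true_dyes)) for row in CALIBRATION]
recovered = multicomponent(CALIBRATION, measurement)
```

Because the measurement in this example is synthesized from the same matrix that is used for unmixing, `recovered` reproduces `true_dyes` to within floating-point error. With real data the pseudo-inverse instead yields the least-squares estimate of the dye signals.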
The manifestation of calibration error in the multicomponenting process is a phenomenon known as "pull-up" or "pull-down" (PU/PD) in the transformed intensity data. In a region of an electropherogram where there exists a peak in a single dye-labeled component, PU/PD can appear as a false positive peak (pull-up) or a "negative peak" (pull-down) in one or more of the alternate dye signals. Given that a peak in one of the dye signals of the transformed data otherwise corresponds to the passage of a particular class of cellular component past the detection system, positive peaks in alternate colors may be interpreted by the subsequent analysis as representing a low-level presence of the corresponding dye-labeled component, when in fact there is none. This type of error can, therefore, introduce a signal indicating the false presence of another dye-labeled species.
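The pull-up and pull-down effect can be illustrated with a deliberately small numerical sketch. The example assumes only two dyes and two virtual filters so that the ordinary 2x2 inverse applies; the spectra, the size of the miscalibration, and the function names are hypothetical and chosen only to make the sign of the error visible.

```python
# Illustrative sketch of pull-up/pull-down; all names and values are hypothetical.

def invert_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def unmix(calibration, measurement):
    """Multiply a measurement vector by the inverse of the calibration matrix."""
    inv = invert_2x2(calibration)
    return [sum(w * v for w, v in zip(row, measurement)) for row in inv]

# Rows are virtual filters, columns are dyes.
TRUE_SPECTRA = [[0.80, 0.30],
                [0.20, 0.70]]

# Reference spectra in which dye 1's emission in filter 2 is underestimated.
REFERENCE_LOW = [[0.85, 0.30],
                 [0.15, 0.70]]

# Reference spectra in which dye 1's emission in filter 2 is overestimated.
REFERENCE_HIGH = [[0.75, 0.30],
                  [0.25, 0.70]]

# A peak containing dye 1 only (amplitude 100), as seen through the true spectra.
peak = [TRUE_SPECTRA[0][0] * 100.0, TRUE_SPECTRA[1][0] * 100.0]

clean = unmix(TRUE_SPECTRA, peak)        # dye 2 signal is essentially zero
pull_up = unmix(REFERENCE_LOW, peak)     # dye 2 signal is falsely positive
pull_down = unmix(REFERENCE_HIGH, peak)  # dye 2 signal is falsely negative
```

In this sketch a miscalibration of 0.05 in a single matrix entry is enough to produce a spurious dye 2 signal of roughly a tenth of the true dye 1 peak, with a sign that depends on the direction of the miscalibration.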
Furthermore, should the miscalibration manifest itself as pull-down (negative peaks), heuristic algorithms typically used to determine the baseline (background) signal in each dye/color channel as a function of time may be adversely affected. This can then lead to further problems in the subsequent processing of the data, such as the appearance of additional false peaks due to band-pass filtering or deconvolution of elevated baseline regions, etc. Depending upon the quality of the data being processed, a low level of false peaks introduced by spectral miscalibration error may drastically lower the effective signal-to-noise ratio of the electrophoretic measurements.
If low-level peaks cannot otherwise be classified as real peaks or false ones, they will effectively define a threshold of true peak detection. Therefore, applications demanding a large dynamic range of detection may be adversely affected or rendered infeasible if the spectral calibration cannot be performed accurately. A good example of such an application is sequencing for polymorphism (heterozygote) detection. In such an application, heterozygous positions in a sequence are manifest as truly (or nearly) overlapping peaks in the electropherogram. Certain protocols, such as when pooling patient samples, demand a low threshold of detection and correspondingly large dynamic range.
Other examples where PU/PD systematic error can introduce classification errors in subsequent analysis include allele identification for fragment analysis applications (e.g., GeneScan™). These types of error may adversely affect genetic mapping projects or Human Identification analyses.
As a result, a need exists for new approaches to spectral calibration that overcome the deficiencies of the conventional approaches.