1. Field of the Invention
This invention relates to the field of sequencing nucleic acids.
2. Description of the Prior Art
One of the classic methods of sequencing DNA with a radioisotope label and slab gels was first described by Sanger et al. (F. Sanger, S. Nicklen, and A. R. Coulson, Proc. Nat. Acad. Sci. 1977 74: 5463). Four individual sets of sequencing fragments, each terminating in one of the four 2',3'-dideoxynucleoside 5'-triphosphates, ddNTPs (2',3'-dideoxyadenosine 5'-triphosphate, ddATP; 2',3'-dideoxycytidine 5'-triphosphate, ddCTP; 2',3'-dideoxyguanosine 5'-triphosphate, ddGTP; and 2',3'-dideoxythymidine 5'-triphosphate, ddTTP), are produced via enzymatic extension of a primer hybridized to the template DNA by sequential addition of template-complementary 2'-deoxynucleoside 5'-triphosphates, dNTPs (dA, dC, dG and dT). A radioisotope label (³²P) is incorporated into each fragment via a labeled primer, dNTP, or ddNTP, such that when the four sets of fragments are separated in four adjacent lanes of an electrophoretic slab gel, the fragments can be visualized on photographic film. The resulting pattern that the fragments display must be properly aligned since sequence information is read from the relative position of the bands in the four gel lanes. Migration anomalies arising from bubbles in the gel matrix or "smiles" arising from temperature inhomogeneity within the gel during the separation make sequence reading an art. Highly accurate sequence determinations derived from these gels require a high level of skill.
More recently, methods employing four different fluorescent labels have been described by Smith et al. (U.S. Pat. No. 5,171,534; Nature 1986 321(12): 674-679) and Prober et al. (U.S. Pat. No. 5,332,666; U.S. Pat. No. 5,306,618; U.S. Pat. No. 5,242,796; Science 1987 238: 336-341). These methods utilize four spectrally distinguishable fluorescent tags, each tag associated with one of the four nucleotide terminators. These tags are used to distinctly label each of the four sets of fragments, which can then be separated in a single electrophoretic run. These methods circumvent the lane alignment problems associated with the original Sanger dideoxy-mediated sequencing protocol, since all four sets of fragments are separated in a single lane of the electrophoretic gel. However, these methods require significantly more complex instrumentation than the original Sanger method since the partially overlapping sets of fragments must be photometrically monitored and the four emission wavelengths, "colors", associated with the four fluorophore labels must be distinguished to allow proper assignment of base identity (adenine (A), cytosine (C), guanine (G), or thymine (T)) and base position within the template DNA. In addition, four chemically-similar but spectrally-distinguishable fluorophore labels must be developed. The four labels must be chemically-similar, otherwise the four labels would impart different mobility shifts to the four sets of fragments resulting in migration order errors. Specific sets of fluorophore dyes useful for labeling the sequencing fragments in these methods are described by Fung et al. (U.S. Pat. No. 4,855,255), Hobbs et al. (U.S. Pat. No. 5,047,519), Menchen et al. (U.S. Pat. No. 5,188,934), Prober et al. (U.S. Pat. No. 5,242,796), and Bergot et al. (U.S. Pat. No. 5,366,860).
The previously described sequencing methods are based on simultaneous separation of a mixture of the four types of terminated sequence fragments and discrimination among the bases by the color of the sequencing fragment band. Numerous other sequencing methods based on separation of various combinations of the four types of terminated sequencing fragments have been developed. Orgel and Patrick (U.S. Pat. No. 4,865,968) demonstrated a sequencing method based on patterns obtained from three distinct sequencing mixtures derived from a template DNA. The first mixture contains oligonucleotide fragments derived from termination by all four 2',3'-dideoxynucleoside 5'-triphosphate terminators. The second mixture contains only those fragments derived from use of a first and a second 2',3'-dideoxynucleoside 5'-triphosphate terminator, and the third mixture contains only those fragments derived from use of a first and a third 2',3'-dideoxynucleoside 5'-triphosphate terminator. These three mixtures are run in adjacent lanes of an electrophoretic slab gel and the DNA sequence of the template DNA is read from the relative migration order of the fragments in the three adjacent separations. These three separations must be properly aligned to obtain accurate sequence information. All fragments in the three mixtures contain a single label type.
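The three-lane readout described for Orgel and Patrick can be illustrated with a minimal sketch. It assumes perfectly aligned lanes, bands indexed by integer fragment length, and a hypothetical assignment of A, C, and G as the first, second, and third terminators (the text above does not specify which bases fill these roles); the function name is illustrative only.

```python
def decode_three_lanes(lane_all, lane_first_second, lane_first_third,
                       first="A", second="C", third="G", fourth="T"):
    """Read a sequence from three aligned lanes of band positions.

    lane_all:          positions of bands in the four-terminator lane
    lane_first_second: positions in the lane with the first and second terminators
    lane_first_third:  positions in the lane with the first and third terminators

    The base-to-role assignment (first/second/third/fourth) is a
    hypothetical choice for illustration, not taken from the cited patent.
    """
    in_lane_2 = set(lane_first_second)
    in_lane_3 = set(lane_first_third)
    calls = []
    for pos in sorted(lane_all):
        if pos in in_lane_2 and pos in in_lane_3:
            calls.append(first)    # present in both two-terminator lanes
        elif pos in in_lane_2:
            calls.append(second)   # present only in the first+second lane
        elif pos in in_lane_3:
            calls.append(third)    # present only in the first+third lane
        else:
            calls.append(fourth)   # present only in the all-terminator lane
    return "".join(calls)
```

Because each call depends on whether a band appears at the same position in the other lanes, any lane-to-lane offset changes the result, which is why alignment is critical for this scheme.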
Tabor and Richardson (U.S. Pat. No. 4,962,020) described a method of DNA sequencing in which the relative ratios of the 2',3'-dideoxynucleoside 5'-triphosphate terminators are varied in conjunction with the use of a modified T7 DNA polymerase. A manganese ion cofactor enables the maintenance of relatively constant label signal intensity (as measured in peak height) for the various series of oligonucleotide fragments in the mixture. In this way, a single sequencing reaction with different concentrations of ddA, ddC, ddG, and ddT terminators can be used to derive the sequence of a template DNA. A single label (fluorophore) is used in all reaction mixtures.
Ansorge (U.S. Pat. No. 5,124,247) developed a method of nucleic acid sequencing in which a single label is employed to monitor and distinguish individual bands in separations of mixtures of nucleic acids. Two mixtures, one containing ddA- and ddG-terminated fragments (A+G) and another containing ddC- and ddT-terminated fragments (C+T), are produced via enzymatic extension of a primer hybridized to the template DNA using a DNA polymerase. A single fluorophore label is attached to the primer or to one of the deoxynucleotides in the reaction mixture. Binary coding of the two sets of fragments in each mix is accomplished by employing different concentrations of the 2',3'-dideoxynucleoside 5'-triphosphate terminators in the two reactions. For example, in the A+G reaction, the ddA terminator is present in fivefold greater concentration than the ddG terminator. This results in ddA-terminated fragment signals that are fivefold greater than the signals generated by the ddG-terminated fragments. Thus, the ddA fragments can be distinguished from the ddG fragments in the same separation because the ddA fragments give signals, represented as peaks, that are five times the height (or five times the area) of those of the ddG-terminated fragments. By running two sequencing reactions and two parallel separations of the fragments from these two reactions, the sequence information can be deduced from the peak order and the relative peak magnitudes. Proper alignment of the separations in two adjacent lanes of the electrophoretic gel is required to obtain accurate sequencing information.
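The binary peak-height coding described for Ansorge can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: peaks are given as (position, height) pairs in already aligned lanes, the height of a "short" peak is known from calibration (the hypothetical `short_height` parameter), the decision threshold is placed at the geometric mean of the short and tall heights, and ddC is assumed to be the terminator in excess in the C+T reaction (the text above gives the fivefold excess only for ddA in the A+G reaction).

```python
import math


def call_lane(peaks, tall_base, short_base, short_height=1.0, ratio=5.0):
    """Assign a base to each (position, height) peak in one two-terminator lane.

    A peak at or above the geometric mean of the short and tall heights is
    called as tall_base; otherwise it is called as short_base.
    """
    threshold = short_height * math.sqrt(ratio)
    return {pos: (tall_base if height >= threshold else short_base)
            for pos, height in peaks}


def decode_two_lanes(ag_peaks, ct_peaks):
    """Merge calls from the A+G and C+T lanes into one sequence string."""
    calls = call_lane(ag_peaks, tall_base="A", short_base="G")
    # Assumption for illustration: ddC is the terminator in fivefold excess.
    calls.update(call_lane(ct_peaks, tall_base="C", short_base="T"))
    return "".join(calls[pos] for pos in sorted(calls))
```

Merging by sorted position only works if the two lanes share a common position axis, which is the alignment requirement noted above.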
Konrad and Pentoney (U.S. Pat. No. 5,273,638) described a similar approach in which two sets of sequencing reactions are prepared for each template DNA to be sequenced. Each reaction mixture contains fragments terminated by three of the four possible 2',3'-dideoxynucleoside 5'-triphosphate terminators, though the identity of the three terminators in the first reaction mixture (e.g. ddA, ddC, ddG) is different from the identity of the three terminators in the second reaction mixture (e.g. ddC, ddG, ddT). In addition, the relative concentrations of the three terminators in each mix are different (e.g. 4:1.7:0.7) such that the resulting peaks in the separations of the mixtures are height- or area-coded. Thus, by comparing the relative magnitude of the heights (or areas) of the peaks in one of the separations, one can assign a nucleotide sequence to the pattern of peaks. Since peaks for one of the nucleotides will not be present in the separation, a second parallel separation of the second set of fragments is used to identify the relative positions of the omitted nucleotide of the first reaction and to resolve conflicts/anomalies arising from the peak height-encoding strategy.
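A minimal sketch of the three-level height coding described for Konrad and Pentoney is given below. Several details are assumptions made for illustration only: each lane is a mapping from an integer band position to a peak height on a common, aligned position axis; the 4:1.7:0.7 ratio is assigned to ddA:ddC:ddG in the first mixture and to ddC:ddG:ddT in the second (the text above does not state which terminator receives which level); and peaks are assigned to the nearest expected height on a logarithmic scale. The conflict-resolution role of the second lane is not modeled.

```python
import math


def nearest_code(height, codes):
    """Return the base whose expected height is closest to `height` (log scale)."""
    return min(codes, key=lambda base: abs(math.log(height / codes[base])))


def decode_height_coded(lane1, lane2, codes1, codes2, omitted1):
    """Call bases from lane 1 and fill in the omitted base from lane 2.

    lane1, lane2: dicts mapping band position -> peak height
    codes1, codes2: dicts mapping base -> expected relative height (assumed)
    omitted1: the base whose terminator is absent from the first mixture
    """
    calls = {pos: nearest_code(h, codes1) for pos, h in lane1.items()}
    for pos, h in lane2.items():
        # Positions absent from lane 1 are taken from lane 2, but only when
        # lane 2 calls them as the base omitted from the first mixture.
        if pos not in calls and nearest_code(h, codes2) == omitted1:
            calls[pos] = omitted1
    return "".join(calls[pos] for pos in sorted(calls))


# Hypothetical level assignments following the 4:1.7:0.7 example ratio.
CODES_MIX1 = {"A": 4.0, "C": 1.7, "G": 0.7}
CODES_MIX2 = {"C": 4.0, "G": 1.7, "T": 0.7}
```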
Tabor and Richardson (U.S. Pat. No. 5,409,811) detailed a sequencing method utilizing a single mixture of three 2',3'-dideoxynucleoside 5'-triphosphate terminators, which differ in relative concentration in the sequencing mixture. The resulting 2',3'-dideoxynucleoside 5'-triphosphate-terminated fragments can be differentiated by the relative intensities of the bands in the separation.
Mathies et al. (U.S. Pat. No. 5,436,130) described a sequencing method using a binary coding scheme in conjunction with two fluorophore labels. The mixture of four 2',3'-dideoxynucleoside 5'-triphosphate terminated fragments is separated in a single lane or channel of electrophoresis. The individual nucleotides of the DNA are distinguished by a combination of the intensity and the spectral characteristics (ratio of color at two wavelengths due to the presence of the two fluorophores in differing ratios) of the peaks.
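The use of a two-wavelength color ratio to distinguish bases, as described for Mathies et al., can be illustrated with a simplified sketch. It considers only the ratio of the two channels and ignores the overall intensity that the cited method also uses; the expected channel fractions per base are hypothetical values chosen for illustration, and every peak is assumed to have a nonzero total signal.

```python
def call_by_color_ratio(peaks, expected_fraction):
    """Call a base for each peak from its fraction of signal in channel 1.

    peaks: iterable of (position, channel1_intensity, channel2_intensity)
    expected_fraction: dict mapping base -> expected channel1 / (channel1 + channel2)
    """
    calls = []
    for _pos, ch1, ch2 in sorted(peaks):
        fraction = ch1 / (ch1 + ch2)  # assumes ch1 + ch2 > 0
        base = min(expected_fraction,
                   key=lambda b: abs(fraction - expected_fraction[b]))
        calls.append(base)
    return "".join(calls)


# Hypothetical expected fractions; not taken from the cited patent.
HYPOTHETICAL_FRACTIONS = {"A": 0.9, "C": 0.65, "G": 0.35, "T": 0.1}
```

Since all four fragment sets are separated in one lane or channel, this scheme needs no lane-to-lane alignment; the trade-off is two-color detection.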
To obtain useful sequence data, many of the above sequencing methods require proper alignment of the separation patterns derived from the various mixtures of terminated sequencing fragments run in adjacent lanes of an electrophoretic gel. Problems with misalignment between lanes in electrophoretic separations of nucleic acid sequencing fragments can be caused by inclusion of a gas bubble in an electrophoretic gel lane, lane-to-lane variation of the temperature of the gel during separation, variation in the extent of polymerization of the gel from lane to lane, variation in the ionic composition of the samples loaded into different lanes, and variation in the depths of the wells into which the samples are loaded. Hara (U.S. Pat. No. 4,720,786) described a method of correcting for these offset distortions between adjacent lanes in an electrophoretic gel based on the resolution of bands corresponding to the smaller oligonucleotide fragments in the lower portion of the electrophoretic separations. The method relies on the assumption that because the band spacing is greater in the lower portion of the separation, the proper band migration pattern can be determined even in the presence of offset distortions in one or more of the electrophoretic lanes. Once the degree of offset distortion has been established in this portion of the gel, a correction factor can be derived and applied to other gel lanes resulting in properly ordered band patterns. Because the offset distortion is determined for bands in the lower part of the gel, and because the degree of distortion varies from the bottom to the top of the gel, the correction factor for bands in the middle and at the top of the gel must be extrapolated from data derived from bands at the bottom of the gel. The extent of offset distortion must vary in a predictable and consistent manner for this alignment protocol to be effective.
Fujii (U.S. Pat. No. 5,419,825) described an apparatus and method for sequencing which relies on calibration coefficients for time bases of respective electrophoresis lanes which are evaluated from differences between positions of signals in a range known to cause no sequence inversion. In essence, Fujii determines "correction factors" for the various electrophoretic lanes based on the migration patterns for the highest mobility (smallest) fragments in each lane, and on the assumption that no band migration order inversion has occurred for these smallest fragments. Once these correction factors have been determined, they are applied to the next group of electrophoretic bands reaching the detector to correct for any mobility differences between lanes for this next group of bands. Once corrected, these bands are used to derive new correction factors which can be applied to the next group of peaks reaching the detector.
Both correction methods rely on the assumption that the relative migration order of the highest mobility bands (smallest fragments) in each of the four electrophoretic separations that are to be aligned is correct. Both methods also rely on the assumption that changes in electrophoretic conditions during the course of the separation (e.g. temperature fluctuations) affect all four electrophoretic separations equally and thus can be compensated by calculated correction factors. Neither method could be used to align electrophoretic bands obtained in four sequential separations in which the fragments were separated on the same electrophoretic medium (e.g. separated by capillary electrophoresis in a single capillary), since temporal variations in separation conditions are not likely to be reproducible in sequential separations.
In contrast, the method of the present invention makes no assumptions about the temporal migration order in any region of the four separations. The method of the instant invention also makes no assumptions about the effect of fluctuations in separation conditions over the course of the separation. Temporal changes in separation conditions during the electrophoresis run will be reflected in minute changes in the band spacing pattern in a given separation, which may result in a local expansion or contraction of the peak spacing. This will be corrected internally in that separation but will have no impact on separation patterns for the other three bases. Thus, the invention method can be applied not only to separations run simultaneously in adjacent lanes of an electrophoretic slab gel, but can also be utilized to align separations run sequentially in the same electrophoretic capillary or run in serial or parallel fashion on two, three, or four different capillaries or slab gels.