Approaches to DNA sequencing over the past twenty years have varied widely. The use of enzymes and chemicals is making it possible to sequence the human genome. However, this effort takes enormous resources.
Until recently, only two general sequencing methods were available: the Maxam-Gilbert chemical degradation method (Maxam and Gilbert, 1977, Proc. Natl. Acad. Sci., USA 74:560), and the Sanger dideoxy chain termination method (Sanger et al., 1977, Proc. Natl. Acad. Sci., USA 74:5463). Using the dideoxy chain termination DNA sequencing method, DNA molecules of differing lengths are generated by enzymatic extension of a synthetic primer, using DNA polymerase and a mixture of deoxy- and dideoxy-nucleoside triphosphates. To perform this reaction, the DNA template is incubated with a mixture containing all four deoxynucleoside 5′-triphosphates (dNTPs), one or more of which is labeled with 32P, and a 2′,3′-dideoxynucleoside triphosphate analog (ddNTP). Four separate incubation mixtures are prepared, each containing a different ddNTP analog (ddATP, ddCTP, ddGTP, or ddTTP). The dideoxynucleotide analogs are incorporated normally into the growing complementary DNA strand by the DNA polymerase, through their 5′ triphosphate groups.
However, because of the absence of a 3′-OH group on the ddNTP, a phosphodiester bond cannot be formed with the next incoming dNTP. This results in termination of the growing complementary DNA chain. Therefore, at the end of the incubation period, each reaction mixture contains a population of DNA molecules having a common 5′ terminus, but varying in length to a nucleotide base-specific 3′ terminus. These four preparations, with heterogeneous fragments each ending in either cytosine (C), guanine (G), adenine (A) or thymine (T), are separated in four parallel lanes on polyacrylamide gels. The sequence is determined after autoradiography, by identifying the terminal nucleotide base at each incremental step in the molecular weight of the electrophoresed fragments.
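The lane-by-lane read-out described above can be illustrated with a minimal Python sketch. It assumes an idealized gel in which every fragment length is resolved, ignores the length of the primer, and uses hypothetical function names (`synthesized_strand`, `lane_lengths`, `read_gel`) chosen only for this illustration; it is not taken from any of the cited methods.

```python
# Idealized model of dideoxy chain-termination read-out (illustrative only).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def synthesized_strand(template_3_to_5: str) -> str:
    """Strand copied by the polymerase, written 5'->3'.

    The template is given 3'->5', so the copy is a base-by-base complement.
    """
    return "".join(COMPLEMENT[base] for base in template_3_to_5)


def lane_lengths(strand: str) -> dict:
    """For each ddNTP lane, the set of fragment lengths that terminate there.

    A fragment of length n appears in lane X when base n of the strand is X.
    Assumption: lengths are counted from the first added base, not the primer.
    """
    lanes = {base: set() for base in "ACGT"}
    for length, base in enumerate(strand, start=1):
        lanes[base].add(length)
    return lanes


def read_gel(lanes: dict) -> str:
    """Read the sequence by walking the bands from shortest to longest."""
    base_at_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_at_length[n] for n in sorted(base_at_length))


strand = synthesized_strand("TACGGT")
lanes = lane_lengths(strand)
print(read_gel(lanes))
```

Each band length occurs in exactly one lane, which is why the four lanes can be merged into a single ordered read.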
The Maxam-Gilbert method of DNA sequencing involves base-specific chemical cleavage of DNA. In this method, radio-labeled DNA molecules are incubated in four separate reaction mixtures, each of which partially cleaves the DNA at one or two nucleotides of a specific identity (G, A+G, C or C+T). The resulting DNA fragments are separated by polyacrylamide gel electrophoresis, with each of the four reactions fractionated in a separate lane of the gel. The DNA sequence is determined after autoradiography, again by observing the separation of the fragments in the four lanes of the gel.
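Because two of the four reactions cleave at two bases, the lanes must be compared to assign each band. The following Python sketch shows one way this comparison could be expressed. It is a simplification: band positions are indexed directly by the cleaved base rather than by the actual labeled fragment length, and the function names are hypothetical.

```python
# Idealized decoding of the four Maxam-Gilbert lanes (illustrative only).

def maxam_gilbert_lanes(sequence: str) -> dict:
    """Return, for each reaction (G, A+G, C, C+T), the positions it cleaves."""
    lanes = {"G": set(), "A+G": set(), "C": set(), "C+T": set()}
    for position, base in enumerate(sequence, start=1):
        if base in "AG":
            lanes["A+G"].add(position)
        if base == "G":
            lanes["G"].add(position)
        if base in "CT":
            lanes["C+T"].add(position)
        if base == "C":
            lanes["C"].add(position)
    return lanes


def decode_lanes(lanes: dict) -> str:
    """Assign a base to every band by comparing the four lanes."""
    bases = []
    for position in sorted(lanes["A+G"] | lanes["C+T"]):
        if position in lanes["G"]:
            bases.append("G")  # band in both the G and A+G lanes
        elif position in lanes["A+G"]:
            bases.append("A")  # band in the A+G lane only
        elif position in lanes["C"]:
            bases.append("C")  # band in both the C and C+T lanes
        else:
            bases.append("T")  # band in the C+T lane only
    return "".join(bases)


print(decode_lanes(maxam_gilbert_lanes("GATTACA")))
```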
The use of fluorescent nucleotides has eliminated the need for radioactive nucleotides, and provided a means to automate DNA sequencing. As fluorescent DNA fragments on an electrophoresis gel pass by a detector, the sequential fluorescent signals (which correspond to a fragment ending in a particular nucleotide) are automatically converted into the DNA sequence, eliminating the additional step of exposing the gel to film. Improvements on this general concept have been the subject of several U.S. patents, including U.S. Pat. No. 5,124,247 to Ansorge, U.S. Pat. No. 5,242,796 to Prober et al., U.S. Pat. No. 5,306,618 to Prober et al., U.S. Pat. No. 5,360,523 to Middendorf et al., U.S. Pat. No. 5,556,790 to Pettit, and U.S. Pat. No. 5,821,058 to Smith et al. However, the methods disclosed in these patents still require the inconvenient step of separating the generated DNA fragments by size, using electrophoresis.
There are several disadvantages associated with using electrophoresis for nucleic acid sequencing. Electrophoresis requires macroscopic separation, with the attendant need for expensive reagents, long gel preparation time, tedious sample loading, and the danger of exposure to the neurotoxin acrylamide. Macromolecular electrophoretic separation also exposes the technician to high voltage devices, requires prolonged electrophoresis time, produces gel artifacts, and requires calculations to adjust for dye mobilities. Furthermore, a sequencing run allows fewer than 1000 bases to be sequenced at a time, which can be a substantial drawback when sequencing long stretches of the genome.
Given the practical drawbacks of electrophoresis, attempts have been made to eliminate this step. Mills, for example, described the use of mass spectrometry to separate the DNA fragments as an alternative to electrophoresis (U.S. Pat. Nos. 5,221,518 and 5,064,754). However, mass spectrometry devices are expensive, and because the method depends on size separation, it has a size resolution limit.
Others have attempted to separate nucleic acid sequences by size using capillary electrophoresis (Karger, Nucl. Acids Res. 19:4955-62, 1991). In this method, fused silica capillaries filled with polyacrylamide gel are used as an alternative to slab gel electrophoresis. However, this method is limited by the separation process and requires very high detection sensitivity and wavelength selectivity due to the small sample size.
Melamede (U.S. Pat. No. 4,863,849) and Cheeseman (U.S. Pat. No. 5,302,509) describe DNA sequencing methods which require a complex external liquid pumping system to add and remove necessary reagents. In these “open” systems, which contain the polymerase and the DNA to be sequenced, fluorescent nucleotides are pumped into a reaction chamber and added to the DNA molecule. After the incorporation of a single nucleotide, unincorporated fluorescent dNTPs are removed, leaving behind the DNA and its newly incorporated fluorescent nucleotide. This incorporated nucleotide is detected, its signal converted into a DNA sequence, and the process is repeated until the sequencing is complete. Although these methods can eliminate the electrophoresis step, the addition of nucleotides must be monitored one at a time as they are added to a population of DNA molecules, by continually pumping materials in and out of the reaction chamber.
In another automated process, Jett et al. (U.S. Pat. Nos. 4,962,037 and 5,405,747) use an exonuclease to sequentially shorten a DNA molecule that is being sequenced. After a complementary DNA strand is synthesized in the presence of fluorescent nucleotides, the exonuclease cleaves individual fluorescent nucleotides from the end of the synthesized DNA molecule. These nucleotides pass through a detector, and the fluorescent signal emitted by each nucleotide is recorded to determine the DNA sequence.
In the methods of Melamede (U.S. Pat. No. 4,863,849) and Cheeseman (U.S. Pat. No. 5,302,509) described above, the addition of nucleotides to, or release of nucleotides from, several DNA molecules is monitored simultaneously. This is sequencing at the macromolecular level, as opposed to sequencing at the molecular level, which involves monitoring the addition or release of nucleotides from a single DNA molecule. A disadvantage of macromolecular sequencing methods is that even though all of the DNA molecules start with identical nucleotides, they may quickly evolve into a mixed population. When using the macromolecular methods, some chains may incorporate nucleotides more efficiently than others, and some DNA may be degraded more slowly or more rapidly than others.
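The loss of synchrony in a population can be made concrete with a simple probabilistic sketch. The model below is an illustrative assumption and is not drawn from the cited patents: each molecule is taken to extend by exactly one nucleotide per cycle with a fixed, independent probability `p`, and otherwise to stay at its current length. The function names and parameters are hypothetical.

```python
# Toy model of dephasing in a population of extending DNA molecules.
# Assumption: each molecule extends by one base per cycle with probability p.
from math import comb


def length_distribution(cycles: int, p: float) -> list:
    """Fraction of molecules that have added k bases after `cycles` cycles.

    Under the stated assumption, the number of added bases is binomial.
    """
    return [
        comb(cycles, k) * p**k * (1 - p) ** (cycles - k)
        for k in range(cycles + 1)
    ]


def in_phase_fraction(cycles: int, p: float) -> float:
    """Fraction of molecules that extended in every cycle (still in phase)."""
    return length_distribution(cycles, p)[cycles]


for n in (10, 50, 100):
    print(n, round(in_phase_fraction(n, 0.99), 3))
```

Under this model, even a 99% per-cycle incorporation efficiency leaves only about 37% of the molecules in phase after 100 cycles, so the detected signal increasingly reflects a mixture of positions.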
To solve this synchronization problem, Jett et al. (U.S. Pat. No. 4,962,037) and Ulmer (U.S. Pat. No. 5,674,743) developed molecular level sequencing systems in which a single fluorescently labeled DNA base is sequentially cleaved from a DNA molecule. The fluorescent signal from each cleaved dNTP is used to determine the DNA sequence. One drawback to these methods, however, is that the DNA molecule that is being sequenced must be held in a stream, which often results in shearing of the DNA, especially at higher flow rates. The sheared DNA molecule cannot be accurately sequenced. In addition, only one DNA molecule can be sequenced at a time by this method.
The development of fluorescence resonance energy transfer (FRET) labels for DNA sequencing has been described by Ju (U.S. Pat. No. 5,814,454) and Mathies et al. (U.S. Pat. No. 5,707,804). During FRET, exciting the donor dye with light of a first wavelength releases light of a second wavelength, which in turn excites the acceptor dye(s) to emit light of a third wavelength, which is then detected. These patents disclose the attachment of FRET labels to oligonucleotide primers for sequencing DNA molecules. A drawback of these methods is that there is still a need for size separation (for example using electrophoresis) prior to determining the DNA sequence.
Therefore, there remains a need for a method of sequencing nucleic acids at the molecular scale that does not require the use of electrophoresis or complex liquid pumping systems, and does not result in the shearing of nucleic acids. In addition, methods that are automated would be particularly useful.