This invention relates to DNA polymerases suitable for DNA sequencing and to automated and manual methods for DNA sequencing.
The following is a brief description of art relevant to DNA sequencing techniques. This is provided only to give general guidance to those reading the application, and is not an admission that any art cited herein or referred to explicitly or implicitly is prior art to the appended claims.
DNA sequencing generally involves the generation of four populations of single-stranded DNA fragments having one defined terminus and one variable terminus. The variable terminus generally terminates at specific nucleotide bases (either guanine (G), adenine (A), thymine (T), or cytosine (C)). The four different sets of fragments are each separated on the basis of their length, one procedure being on a high resolution polyacrylamide gel; each band on the gel corresponds colinearly to a specific nucleotide in the DNA sequence, thus identifying the positions in the sequence of the given nucleotide base. See Tabor and Richardson, U.S. Pat. Nos. 4,942,130 and 4,962,020.
There are two general methods of DNA sequencing. One method (Maxam and Gilbert sequencing) involves the chemical degradation of isolated DNA fragments, each labeled with a single radiolabel at its defined terminus, each reaction yielding a limited cleavage specifically at one or more of the four bases (G, A, T or C). The other method (dideoxy or chain-termination sequencing) involves the enzymatic synthesis of a DNA strand (Sanger et al., Proc. Nat. Acad. Sci. USA 74:5463, 1977). Four separate syntheses are generally run, each reaction being caused to terminate at a specific base (G, A, T or C) via incorporation of an appropriate chain terminating nucleotide, such as a dideoxynucleotide. The latter method is preferred since the DNA fragments can be uniformly labelled (instead of end labelled) by the inclusion of a radioactively labeled nucleoside triphosphate and thus the larger DNA fragments contain increasingly more radioactivity. Further, ³⁵S-labelled nucleotides can be used in place of ³²P-labelled nucleotides, resulting in sharper definition; the reaction products are easier to interpret since each lane corresponds only to either G, A, T or C. The enzymes used for most dideoxy sequencing include T7 DNA polymerase and DNA polymerases isolated from thermophilic organisms such as Taq, Vent, Tth, and others. Other polymerases used to a lesser extent include AMV reverse transcriptase and Klenow fragment of E. coli DNA polymerase I.
In the dideoxy chain terminating method a short single-stranded primer is annealed to a single-stranded template. The primer is elongated at its 3'-end by the incorporation of deoxynucleotides (dNMPs) until a dideoxynucleotide (ddNMP) is incorporated. When a ddNMP is incorporated elongation ceases at that base. Other chain terminating agents can be used in place of a ddNTP and the ddNTP can be labelled as discussed below.
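The four-reaction scheme described above can be illustrated with a minimal sketch. It assumes an idealized reaction in which every position of the newly synthesized strand is represented by at least one terminated fragment, and it counts fragment lengths from the first nucleotide added to the primer. The function names and the example sequence are hypothetical and serve only as illustration.

```python
BASES = "GATC"


def termination_fragments(synthesized: str) -> dict[str, list[int]]:
    """Return, for each base, the lengths of the fragments ending in that base.

    Each key corresponds to one of the four reactions (one gel lane);
    a fragment of length n ends at position n of the synthesized strand.
    """
    lengths = {base: [] for base in BASES}
    for position, base in enumerate(synthesized, start=1):
        lengths[base].append(position)
    return lengths


def read_sequence(lengths: dict[str, list[int]]) -> str:
    """Read the sequence by ordering all bands from shortest to longest."""
    by_length = {n: base for base, ns in lengths.items() for n in ns}
    return "".join(by_length[n] for n in sorted(by_length))


lanes = termination_fragments("GATTACA")  # hypothetical synthesized strand
print(lanes["A"])            # fragment lengths found in the ddA lane
print(read_sequence(lanes))  # prints GATTACA
```

Reading the bands from shortest to longest across the four lanes recovers the sequence of the synthesized strand, which is complementary to the template.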
Using the above methodology, automated systems for DNA sequence analysis have been developed. One instrument, which was manufactured by EG&G, makes use of conventional dideoxy chain terminating reactions with a radioactively labeled nucleotide. The resulting DNA products are separated by gel electrophoresis. Toneguzzo et al, 6 Biotechniques 460, 1988. A detector scans for radioactivity as it passes through the bottom of the gel. Four synthesis reactions are required for each template to be sequenced, as well as four lanes on each gel, a separate lane being used for products terminated by each specific chain terminating agent.
Kambara et al, 6 Biotechnology 816, 1988, have used a fluorescent-labelled primer. The resulting fluorescently labelled products are excited with a laser at the bottom of the gel and the fluorescence detected with a CRT monitor. This procedure also requires four synthesis reactions and four lanes on the gel for each template to be sequenced.
Applied Biosystems manufactures an instrument in which four different primers are used, each labelled with a different fluorescent marker. Smith et al., 13 Nuc. Acid. Res. 2399, 1985; and 321 Nature 674, 1986. Each primer is used in a separate reaction containing one of four dideoxynucleotides. After the four reactions have been carried out the mixtures are combined and the DNA fragments are fractionated in a single lane on a gel. A laser at the bottom of the gel is used to detect fluorescent products after they have been electrophoresed through the gel. This system requires four separate annealing reactions and four separate synthesis reactions for each template, but only a single lane on the gel. Computer analysis of the sequence is made easier by having all four bands in a single lane.
DuPont used to provide an instrument in which a different fluorescent marker was attached to each of four dideoxynucleoside triphosphates. Prober et al., 238 Science 336, 1987. A single annealing step, a single polymerase reaction (containing each of the four labelled dideoxynucleoside triphosphates) and a single lane in the sequencing gel are required. The four different fluorescent markers in the DNA products are detected separately as they are electrophoresed through the gel.
Englert et al., U.S. Pat. No. 4,707,235 (1987), describe a multichannel electrophoresis apparatus having a detection means, disposed substantially across the whole width of the gel, that can detect labelled DNA products as they migrate past the detection means in four separate lanes, and identifies the channel or lane in which the sample is located. Preferably, radioisotopic labels are used.
Inherent to procedures currently used for DNA sequence analysis is the necessity to separate either radioactively or fluorescently-labelled DNA products by a gel permeation procedure such as polyacrylamide gel electrophoresis, and then detect their locations relative to one another along the axis of movement through the gel. The accuracy of this procedure is determined in part by the uniformity of the signal in bands which have permeated approximately the same distance through the gel. Differences or variations in signal intensities between nearby bands create several problems. First, they decrease the sensitivity of the method, which is limited by the ability to detect the bands containing the weakest signals. Second, they create difficulties in determining whether a band with a weak signal is a true signal due to the incorporation of a chain terminating agent, or an artifact due to a pause site in the DNA where the polymerase has dissociated. Third, they decrease the accuracy in determining the DNA sequence between closely spaced bands since the strong signal of one band may mask the weak signal of its neighbor. See Tabor and Richardson, supra.
Variation in band intensity can arise from an inherent property of most DNA polymerases. Most DNA polymerases discriminate against the chain terminating dideoxynucleotides used in DNA sequence analysis. T4 DNA polymerase discriminates against ddNTPs to such an extent that it cannot be used for DNA sequencing. E. coli DNA polymerase I, Taq, and Vent DNA polymerase also discriminate strongly against ddNTPs, each incorporating a ddNMP a thousand times slower than the corresponding dNMP. Tabor and Richardson, supra (both hereby incorporated by reference herein), have shown that T7 DNA polymerase lies at the other end of the spectrum, discriminating against ddNTPs only several fold. If a DNA polymerase discriminated against a ddNTP to the same extent at all sequences, this problem could be overcome by simply altering the ratio of ddNTPs to dNTPs. Such an approach has been used with E. coli DNA polymerase I and Taq DNA polymerase. However, the extent of discrimination varies with the adjacent DNA sequences, which leads to wide variation in the intensity of adjacent radioactive fragments. The intensity of specific fragments can vary by 50-fold for E. coli DNA polymerase I but only several fold for T7 DNA polymerase. Consequently, the bands on a DNA sequencing gel produced by T7 DNA polymerase are of similar intensity, thus facilitating their detection and analysis by automated procedures. In addition, procedures that even further reduce the discrimination against dideoxynucleotides by T7 DNA polymerase are described such that it incorporates dideoxynucleotides equally as well as deoxynucleotides. These procedures and conditions also reduce but do not eliminate discrimination by other DNA polymerases such as Klenow and Taq DNA polymerases. For example, the use of manganese in place of, or in addition to, magnesium in the reaction mixture may reduce or eliminate discrimination against dideoxynucleotides.
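The relationship between discrimination, the ddNTP to dNTP ratio, and band intensity can be sketched numerically. The sketch below assumes a simple competitive model that is not stated in the text: at each site where the terminator's base is called for, the chance of termination is ratio / (ratio + discrimination), where ratio is [ddNTP]/[dNTP] and discrimination is the fold preference for the dNTP. The function names and all numerical values are hypothetical.

```python
import math


def termination_probability(ratio: float, discrimination: float) -> float:
    """Chance that a ddNMP rather than a dNMP is incorporated at one site.

    Assumed model: a discrimination of 1.0 means no preference, and a
    discrimination of 1000.0 means the dNTP is favored a thousand-fold.
    """
    return ratio / (ratio + discrimination)


def band_intensities(discrimination_per_site: list[float], ratio: float) -> list[float]:
    """Fraction of all chains that terminate at each successive site."""
    surviving = 1.0  # fraction of chains still being elongated
    intensities = []
    for discrimination in discrimination_per_site:
        p = termination_probability(ratio, discrimination)
        intensities.append(surviving * p)
        surviving *= 1.0 - p
    return intensities


# Uniform discrimination is compensated by raising the ddNTP:dNTP ratio.
print(math.isclose(termination_probability(0.01, 1.0),
                   termination_probability(10.0, 1000.0)))

# Sequence-dependent discrimination is not: adjacent bands differ widely.
uniform = band_intensities([1000.0, 1000.0], ratio=10.0)
variable = band_intensities([1000.0, 20000.0], ratio=10.0)
print(uniform[0] / uniform[1], variable[0] / variable[1])
```

Under this assumed model a thousand-fold discrimination that is the same at every site is offset exactly by a thousand-fold higher ratio, whereas a site with twenty-fold stronger discrimination yields a band roughly twenty times weaker than its neighbor regardless of the ratio chosen.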
In the presence of manganese, T7 DNA polymerase does not differentiate between ddNTPs and dNTPs, whereas other DNA polymerases such as Klenow fragment, Taq and Vent still discriminate to some degree. For example, Klenow still discriminates against ddNTPs by as much as four-fold in the presence of manganese. More important, even though the overall degree of discrimination by such enzymes as Klenow and Taq DNA polymerases is reduced, the intensity of specific fragments can vary by much more than four-fold due to high discrimination at certain sequences in DNA. These polymerases and procedures are now almost universally used in manual DNA sequencing (i.e., without aid of sequencing machines such as described above) and are extensively used in automated methods. The use of manganese and the lack of discrimination against ddNTPs at all sites results in bands of uniform intensities, thus facilitating the reading of sequencing gels, either by manual or automated procedures. Moreover, the lack of discrimination enables the use of novel procedures for sequence analysis (Tabor and Richardson, supra). A method based on this finding is provided to determine a DNA sequence in a single reaction that contains all four ddNTPs at different ratios, by measuring the relative intensity of each peak after gel electrophoresis. The authors indicate:
The DNA polymerases of this invention do not discriminate significantly between dideoxy-nucleotide analogs and normal nucleotides. That is, the chance of incorporation of an analog is approximately the same as that of a normal nucleotide or at least incorporates the analog with at least 1/10 the efficiency that of a normal analog. The polymerases of this invention also do not discriminate significantly against some other analogs. This is important since, in addition to the four normal deoxynucleoside triphosphates (dGTP, dATP, dTTP and dCTP), sequencing reactions require the incorporation of other types of nucleotide derivatives such as: radioactively- or fluorescently-labelled nucleoside triphosphates, usually for labeling the synthesized strands with ³⁵S, ³²P, or other chemical agents. When a DNA polymerase does not discriminate against analogs the same probability will exist for the incorporation of an analog as for a normal nucleotide. For labelled nucleoside triphosphates this is important in order to efficiently label the synthesized DNA strands using a minimum of radioactivity.
They also state:
The ability to produce nearby bands of approximately the same intensity is useful since it permits the results of any sequencing reaction to be read more easily and with greater certainty. Further, since the DNA products from a sequencing reaction with a specific chain terminating agent form bands which are of approximately the same intensity as that of nearby bands, band intensity itself provides a specific label for the series of bands so formed. The number of DNA products of approximately the same molecular weight produced by a given chain terminating agent varies depending upon the concentration of the chain terminating agent.
Thus, by using a different concentration of each of the four chain terminating agents for the synthesis the DNA products incorporating one chain terminating agent are distinguished from DNA products of approximately the same molecular weight incorporating other chain terminating agents in that they differ in number or amount; consequently, the bands of DNA products can be identified as to chain terminating agent simply by their intensity as compared to the intensities of nearby bands. As a result, two or more series of DNA products, each series having a different chain terminating agent, can be subjected to gel permeation in a single lane and identified, i.e., distinguished from each other, by the intensity of each band as compared to the intensity of nearby bands. Moreover, the syntheses of DNA products incorporating different chain terminating agents need not be carried out separately, in separate containers, but may all be carried out simultaneously in a single reaction vessel, and the same label, e.g., radioisotopic, fluorescent, etc. can, if desired, be used for all chain terminating agents instead of a different label for each, thus simplifying the procedure.
See also Tabor and Richardson Proc. Natl. Acad. Sci. USA 86, 4076-4080 (1989) which indicates that substitution of manganese ions for magnesium ions for catalysis by T7 DNA polymerase or E. coli DNA polymerase reduces the discrimination of these polymerases for ddNTPs by 4-100 fold, and Tabor and Richardson J. Biol. Chem. 265, 8322-8328 (1990) which describes the use of pyrophosphatase and manganese ions to generate dideoxy-terminated fragments of uniform intensity using T7 DNA polymerase.
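The single-lane scheme quoted above, in which each chain terminating agent is used at a different concentration and bands are identified by intensity, can be sketched as follows. This is only an illustration under stated assumptions: band intensities are taken to be uniform for a given terminator, the peaks are assumed to be already normalized against nearby bands, and the expected levels of 8:4:2:1 are hypothetical values rather than ratios taken from the cited references.

```python
import math

# Hypothetical expected relative band intensities, one level per terminator,
# produced by using the four ddNTPs at different concentrations.
EXPECTED_LEVELS = {"G": 8.0, "A": 4.0, "T": 2.0, "C": 1.0}


def call_base(intensity: float, expected: dict[str, float] = EXPECTED_LEVELS) -> str:
    """Assign a band to the base whose expected level is nearest on a log scale."""
    return min(expected, key=lambda base: abs(math.log(intensity / expected[base])))


def call_sequence(peaks: list[float]) -> str:
    """Call every band in a single lane, ordered from shortest to longest fragment."""
    return "".join(call_base(intensity) for intensity in peaks)


# Hypothetical normalized peak heights from one lane.
peaks = [7.6, 4.3, 2.1, 1.9, 3.8, 1.1, 4.2]
print(call_sequence(peaks))  # prints GATTACA
```

The sketch depends on bands from the same terminator having nearly equal intensity. With a polymerase whose discrimination varies by sequence, a variation of several fold would blur levels spaced only two-fold apart, which is why uniform incorporation matters for this approach.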