The present invention relates to thermostable DNA polymerases which have enhanced efficiency for incorporating nucleoside triphosphates labeled with fluorescein family dyes. The present invention provides means for isolating and producing such altered polymerases. The enzymes of the invention are useful for many applications in molecular biology and are particularly advantageous for nucleic acid sequencing.
Incorporation of deoxynucleoside triphosphates (dNTPs) labeled with fluorescent dyes is important for many in vitro DNA synthesis applications. For example, dye-terminator DNA sequencing reactions require the incorporation of fluorescent dideoxynucleotide analogues for termination and labeling. In addition, in vitro synthesis of labeled products may involve incorporation of fluorescent nucleotides or nucleotide analogues. For example, fluorescently labeled DNA has been used in hybridization assays using microarrays of immobilized probes (Cronin et al., 1996, Human Mutation 7:244).
To assure fidelity of DNA replication, DNA polymerases have a very strong bias for incorporation of their normal substrates, referred to herein as conventional deoxynucleoside triphosphates (dNTPs), and against incorporation of unconventional dNTPs including dNTPs and dNTP analogues labeled with fluorescent dyes. In the cell, this property attenuates the incorporation of abnormal bases such as dUTP in a growing DNA strand. In vitro, this characteristic is particularly evident where both conventional and unconventional fluorescently-labeled nucleoside triphosphates are present, such as in DNA sequencing reactions using a version of the dideoxy chain termination method that utilizes dye-terminators (Lee et al., 1992, Nuc. Acids. Res. 20:2471 which is incorporated herein by reference).
Commercially available DNA cycle sequencing kits for dye-terminator methods use chain terminator ddNTPs labeled with fluorescent dyes of the rhodamine family. However, rhodamine dyes are zwitterionic in charge and nucleoside triphosphates labeled with these dyes migrate anomalously in the electrophoretic gels used to separate the sequencing products for detection. This property of rhodamine family dyes necessitates making modifications in the standard sequencing protocol which include the use of dITP and an additional processing step before electrophoresis.
In contrast, negatively charged fluorescent dyes such as fluorescein family dyes allow 1) better separation between the labeled nucleoside triphosphates and labeled primer extension products, and 2) better electrophoretic migration of the labeled sequencing products than neutral or positively charged fluorescent dyes. Thus, the use of fluorescein family dyes avoids the need for additional processing steps required with the use of rhodamine family dyes. However, available dyes of the fluorescein family are not ideal for use in current commercially available DNA cycle sequencing formats because ddNTPs labeled with these dyes are not efficiently incorporated into sequencing products using these formats. Consequently, there is a need for commercially available thermostable DNA polymerases that can efficiently incorporate both conventional and fluorescein-labeled nucleotides. The present invention serves to meet that need. Further, an unexpected property of the mutant enzymes of this invention is the increased rate of primer extension relative to the corresponding wild-type enzyme. Another unexpected property is the increased uniformity of incorporation of the various terminator nucleotides in automated DNA sequence analysis.
The present invention provides template-dependent thermostable DNA polymerase enzymes having reduced discrimination against incorporation of nucleotides labeled with fluorescein family dyes compared to previously characterized enzymes. These enzymes incorporate nucleotides, including deoxynucleotides (dNTPs) and base analogues such as dideoxynucleotides (ddNTPs), that are labeled with fluorescein family dyes more efficiently than conventional thermostable enzymes. Genes encoding these enzymes are also provided by the present invention, as are recombinant expression vectors for providing large amounts of purified enzymes.
By the present invention, a region of criticality within thermostable DNA polymerases is identified which affects the polymerase's ability to incorporate nucleotides labeled with fluorescein family dyes, while retaining the ability to incorporate faithfully natural nucleotides. This region of criticality, or Critical Motif, can be introduced into genes for thermostable DNA polymerases by recombinant DNA methods such as site-specific mutagenesis to provide the advantages of the invention.
Thus, in one aspect, the invention provides recombinant thermostable DNA polymerase enzymes which are characterized in that the enzymes have been mutated to produce the Critical Motif and have reduced discrimination against incorporation of nucleotides labeled with fluorescein family dyes, in comparison to the corresponding wild-type enzyme.
In this aspect, the invention provides recombinant thermostable DNA polymerase enzymes which are characterized in that a) in its native form said polymerase comprises the amino acid sequence (given in one-letter code) LSXXLX(V/I)PXXE (SEQ ID NO: 1), where X is any amino acid; b) the X at position 4 in said sequence is mutated in comparison to said native sequence, except that X is not mutated to E; and c) said thermostable DNA polymerase has reduced discrimination against incorporation of nucleotides labeled with fluorescein family dyes in comparison to the native form of said enzyme. In the three-letter code, this amino acid sequence is represented as LeuSerXaaXaaLeuXaaXaaProXaaXaaGlu (SEQ ID NO: 1), whereby "Xaa" at positions 3, 4, 6, 9, and 10 of this sequence are any amino acid residue, and "Xaa" at position 7 of this sequence is Val or Ile.
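As an illustration only, the one-letter motif of SEQ ID NO: 1 can be written as a regular expression and used to locate the motif in a protein sequence and report the residue at motif position 4. The following Python sketch is not part of the disclosed methods. The function name and the short test peptides are hypothetical: the flanking residues are invented, and the embedded 11-residue fragments are simply instances that satisfy the motif.

```python
import re

# SEQ ID NO: 1 as a regular expression: L S X X L X (V/I) P X X E.
# "." stands for "any amino acid"; the inner group captures the X at position 4.
# A lookahead is used so that overlapping occurrences would also be reported.
CRITICAL_MOTIF = re.compile(r"(?=(LS.(.)L.[VI]P..E))")


def find_critical_motif(protein):
    """Return a list of (1-based start, motif text, residue at motif position 4)."""
    hits = []
    for match in CRITICAL_MOTIF.finditer(protein):
        hits.append((match.start() + 1, match.group(1), match.group(2)))
    return hits


# Hypothetical peptides: invented flanks around fragments that fit the motif.
native_like = "MKTLSQELAIPYEEGHA"   # Glu at motif position 4
mutant_like = "MKTLSQKLAIPYEEGHA"   # Lys at motif position 4

for name, peptide in (("native-like", native_like), ("mutant-like", mutant_like)):
    for start, motif, residue in find_critical_motif(peptide):
        print(name, start, motif, "position 4 =", residue)
```

A scan of this kind only identifies candidate sequences; whether a given enzyme actually shows reduced discrimination against fluorescein-labeled nucleotides has to be established experimentally.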
In another embodiment, the recombinant thermostable DNA polymerases are characterized in that a) the native form of the polymerase comprises the amino acid sequence LS(Q/G)XL(S/A)IPYEE (SEQ ID NO: 2), where X is any amino acid; b) the X at position 4 in said sequence is mutated in comparison to said native sequence, except that X is not mutated to E; and c) said thermostable DNA polymerase has reduced discrimination against incorporation of nucleotides labeled with fluorescein family dyes in comparison to the native form of said enzyme. In the three-letter code, this amino acid sequence is represented as LeuSerXaaXaaLeuXaaIleProTyrGluGlu (SEQ ID NO: 2), whereby "Xaa" at position 3 is Gln or Gly, "Xaa" at position 4 is any amino acid, and "Xaa" at position 6 is Ser or Ala. In a preferred embodiment, the amino acid sequence is LSQXLAIPYEE (SEQ ID NO: 3), where X is any amino acid. In the three-letter code, this amino acid sequence is represented as LeuSerGlnXaaLeuAlaIleProTyrGluGlu (SEQ ID NO: 3), whereby "Xaa" at position 4 is any amino acid. In a more preferred embodiment, the "Xaa" at position 4 is Lys.
In yet another embodiment, the recombinant thermostable DNA polymerases are characterized in that a) the native form of the polymerase comprises the amino acid sequence LSVXLG(V/I)PVKE (SEQ ID NO: 4); b) the X at position 4 in said sequence is mutated in comparison to said native sequence, except that X is not mutated to E; and c) said thermostable DNA polymerase has reduced discrimination against incorporation of nucleotides labeled with fluorescein family dyes in comparison to the native form of said enzyme. In the three-letter code, this amino acid sequence is represented as LeuSerValXaaLeuGlyXaaProValLysGlu (SEQ ID NO: 4), whereby "Xaa" at position 4 is any amino acid and "Xaa" at position 7 is Val or Ile. In a preferred embodiment, the amino acid sequence is LSVXLGVPVKE (SEQ ID NO: 5) where X at position 4 is any amino acid. In the three-letter code, this amino acid sequence is represented as LeuSerValXaaLeuGlyValProValLysGlu (SEQ ID NO: 5), whereby "Xaa" at position 4 is any amino acid. In a more preferred embodiment, the "Xaa" at position 4 is Arg. In another preferred embodiment, the amino acid sequence is LSVXLGIPVKE (SEQ ID NO: 6) where X at position 4 is any amino acid. In the three-letter code, this amino acid sequence is represented as LeuSerValXaaLeuGlyIleProValLysGlu (SEQ ID NO: 6), whereby "Xaa" at position 4 is any amino acid. In a more preferred embodiment, the "Xaa" at position 4 is Arg.
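Because each motif is given in both one-letter and three-letter code, the two representations can be cross-checked mechanically. The following Python sketch is offered for illustration only and assumes the standard three-letter abbreviations of the twenty common amino acids plus "Xaa" for an unspecified residue. The function name is hypothetical. Note that the conversion maps every "Xaa" to "X" and therefore does not carry over restrictions such as "Val or Ile" at position 7 of SEQ ID NO: 4.

```python
# Standard three-letter to one-letter amino acid codes; "Xaa" marks an
# unspecified residue and is written "X" in one-letter code.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def three_to_one(seq3):
    """Convert an unspaced three-letter sequence such as 'LeuSerVal' to 'LSV'."""
    if len(seq3) % 3 != 0:
        raise ValueError("length is not a multiple of three")
    codes = [seq3[i:i + 3] for i in range(0, len(seq3), 3)]
    unknown = [code for code in codes if code not in THREE_TO_ONE]
    if unknown:
        raise ValueError("unknown residue code(s): " + ", ".join(unknown))
    return "".join(THREE_TO_ONE[code] for code in codes)


# Three-letter forms of SEQ ID NO: 5 and SEQ ID NO: 6 as given in the text.
print(three_to_one("LeuSerValXaaLeuGlyValProValLysGlu"))
print(three_to_one("LeuSerValXaaLeuGlyIleProValLysGlu"))
```

A check of this kind also catches transcription slips in a three-letter listing, since any misspelled residue code is rejected rather than silently converted.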
In another aspect of this invention, the particular region of criticality of this invention can be combined with motifs in other regions of the polymerase gene that are known to provide thermostable DNA polymerases with reduced discrimination against incorporation of unconventional nucleotides such as rNTPs and ddNTPs. As exemplified herein, a recombinant Thermus aquaticus (Taq) DNA polymerase enzyme containing two mutations was constructed. The first mutation was an E to K mutation in the X residue at position 4 of the critical motif of this invention. The second mutation, known as the F667Y mutation, allows more efficient incorporation of ddNTPs. This mutation is a phenylalanine to tyrosine mutation at position 667 of Taq DNA polymerase (described in U.S. Pat. No. 5,614,365 and U.S. Ser. No. 8/448,223, abandoned, and herein incorporated by reference). When used in a sequencing reaction with fluorescein dye family-labeled ddNTPs, the E681K F667Y double mutant enzyme was found to produce a readable sequencing ladder. Thus, in one embodiment, a motif conferring reduced discrimination toward dideoxynucleotides is combined with the critical motif of this invention to provide an enzyme having an increased efficiency of incorporation of both labeled and unlabeled ddNTPs.
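The mutation names used above follow the pattern wild-type residue, position, new residue (for example, E681K). As an illustration only, the following Python sketch parses such a name and applies it to a sequence, checking first that the expected wild-type residue is present. The function name and the `offset` parameter are hypothetical. The fragment LSQELAIPYEE is inferred from SEQ ID NO: 3 with Glu at position 4, and its numbering (residues 678 to 688) is inferred from E681 being motif position 4; neither is quoted from a sequence listing.

```python
import re

# Mutation names of the form <wild-type residue><1-based position><new residue>.
MUTATION = re.compile(r"([A-Z])(\d+)([A-Z])")


def apply_mutation(seq, mutation, offset=0):
    """Apply a point mutation such as 'E681K' to seq.

    offset is the number of residues that precede seq in the full-length
    protein, so that full-length numbering can be used on a fragment.
    """
    match = MUTATION.fullmatch(mutation)
    if match is None:
        raise ValueError("unrecognized mutation name: " + mutation)
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = position - offset - 1
    if not 0 <= index < len(seq):
        raise ValueError("position %d lies outside the sequence" % position)
    if seq[index] != wild_type:
        raise ValueError(
            "expected %s at position %d, found %s" % (wild_type, position, seq[index])
        )
    return seq[:index] + replacement + seq[index + 1:]


# Assumed fragment: motif residues numbered 678-688, so 677 residues precede it.
fragment = "LSQELAIPYEE"
print(apply_mutation(fragment, "E681K", offset=677))
```

The wild-type check is the useful part of the design: it guards against applying a mutation name to a sequence whose numbering does not match the one the name assumes.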
In addition, the E681K F667Y mutant enzyme was unexpectedly found to exhibit a significantly increased extension rate relative to an enzyme with the F667Y mutation alone. Thus, in another embodiment of the invention, introduction of the critical motif into a thermostable DNA polymerase enzyme, alone or in combination with other motifs, produces enzymes having an increased extension rate. The double mutant enzyme was also unexpectedly found to produce more uniform peak heights in dye-terminator dideoxy sequencing using rhodamine-labeled terminators. Thus, in yet another embodiment, introduction of the critical motif into a thermostable DNA polymerase enzyme produces enzymes displaying more uniform peak heights in DNA sequencing methods using rhodamine dye family labeled terminators.
In another embodiment, a mutation allowing more efficient incorporation of rNTPs, such as the glutamic acid to glycine mutation at position 615 of Taq DNA polymerase, or E615G mutation (described in U.S. Ser. No. 60/023,376, filed Sep. 6, 1996, and herein incorporated by reference), is combined with the critical motif of this invention to provide an enzyme having an increased efficiency of incorporation of ribonucleotides labeled with fluorescein family dyes.
In another aspect of this invention, genes encoding the polymerases of this invention are also provided. Specifically, genes encoding recombinant thermostable polymerases comprising the critical motif of this invention are provided. Also included in this aspect are genes encoding combinations of two or more mutations that include mutations producing the critical motif of this invention.
In yet another aspect, the invention also provides improved methods of DNA sequencing that allow the use of lower concentrations of fluorescein dye family-labeled ddNTPs, thereby reducing the cost of performing the reactions. The improved methods of the invention also allow the use of lower ratios of fluorescein dye family-labeled ddNTPs to dNTPs. Use of these methods results in numerous advantages, including more efficient polymerization, lower concentrations of template nucleic acid being required, and a decreased likelihood of introducing inhibitors into the reaction mix. These advantages also facilitate the sequencing of long templates. The invention also provides improved methods of sequencing wherein sequencing reactions can be loaded directly onto sequencing gels for subsequent electrophoresis without intermediate purification.
Thus, in one embodiment, the invention provides improved methods for determining the sequence of a target nucleic acid using a recombinant enzyme which a) has a mutation at position 4 which produces the critical motif of this invention and b) has reduced discrimination against incorporation of nucleotides labeled with fluorescein family dyes in comparison with the corresponding wild-type enzyme. Also within the scope of this invention are improved sequencing methods using thermostable DNA polymerase enzymes derived from thermophilic species, where the enzymes contain naturally occurring sequence variations that produce the critical motif of this invention. These native enzymes can also provide reduced discrimination against incorporation of unconventional nucleotides. In this embodiment, the invention provides improved methods of sequencing using a native thermostable DNA polymerase a) having the critical motif of this invention wherein the amino acid in position 4 is not Glu and b) having reduced discrimination against incorporation of nucleotides labeled with fluorescein family dyes.
Also within the scope of this invention are improved methods of producing DNA labeled with fluorescein family dyes. The enzymes of the invention efficiently incorporate fluorescein-labeled dNTPs in a polymerase chain reaction method, producing amplified products that are labeled at various sites with fluorescein family dyes. Thus, in one embodiment, an improved method of labeling DNA comprises a) providing a reaction mixture comprising dNTPs labeled with fluorescein family dyes and an enzyme of the invention and b) performing a nucleic acid amplification reaction.
The enzymes of the invention, and the genes encoding these enzymes, make possible additional aspects of the invention, namely kits for DNA sequencing that comprise a recombinant enzyme of the invention and may additionally include a negatively charged fluorescent terminator compound. Other kits for DNA sequencing comprise a) a negatively charged fluorescent terminator compound and b) a native enzyme of the invention.
The invention also provides kits for producing labeled DNA which comprise a recombinant enzyme of the invention. Other kits for producing labeled DNA comprise a) a negatively charged fluorescent nucleoside triphosphate compound and b) a native enzyme of the invention.