The polymerase chain reaction (PCR) is an excellent method whose use has expanded year by year [Randall K. Saiki et al. (1988) Science 239, 487-491]. PCR can amplify even a single molecule of a DNA fragment. Sequencing PCR-amplified products without cloning them (the direct sequencing method) is also useful [Corinne Wong et al. (1988) Nature, 330, 384-386]. This technique requires neither construction nor screening of libraries, and is a quick method capable of obtaining sequence information from many samples simultaneously.
However, the above direct sequencing method suffers from two major problems.
The first is that unincorporated primers and 2′-deoxyribonucleoside 5′-triphosphates (2′-dNTPs) remain in the reaction system and inhibit the sequencing reactions. Therefore, in conventional methods, such primers and 2′-dNTPs must be removed from the PCR products before sequencing. Many methods exist for purifying PCR products, including electrophoresis, ethanol precipitation, gel filtration and HPLC purification [see, for example, Dorit R. L. et al. (1991) Current Protocols in Molecular Biology, Vol. 11, John Wiley and Sons, New York, 15.2.1-15.2.11]. However, these methods are complicated without exception.
The second problem is quick renaturation of PCR products. Once the PCR products renature into double-stranded DNA, they are no longer single-stranded templates, and annealing between primers and single-stranded templates is inhibited. Methods reported for minimizing renaturation include quenching after denaturation, biotinylation of one primer followed by adsorption of the PCR products onto streptavidin-coated articles, use of exonuclease, asymmetric PCR and the like. See, for example, Barbara Bachmann et al., 1990, Nucleic Acids Res., 18, 1309-. However, most of these methods are time-consuming and very laborious.
To solve these problems, the present inventors proposed an entirely novel method for determining the nucleotide sequence of DNA [WO96/14434]. This method is a direct transcriptional sequencing method utilizing an RNA polymerase such as T7 RNA polymerase and a terminator for the RNA transcription reaction (for example, 3′-deoxyribonucleoside 5′-triphosphates, 3′-dNTPs). According to this method, DNA products amplified by the polymerase chain reaction can be used for sequencing as they are, without removing the unreacted primers and 2′-deoxyribonucleoside 5′-triphosphates (2′-dNTPs) remaining in the PCR reaction system. In addition, because the method requires no denaturation at all, it avoids the problem of quick renaturation of PCR products, and hence is an extremely excellent method.
However, the present inventors further studied the above method, and found that it has a problem to be solved in order to obtain more accurate nucleotide sequence data.
In the above nucleotide sequence determination method, an RNA polymerase such as T7 RNA polymerase is used for the reaction in a mixture comprising ribonucleoside 5′-triphosphates including ATP, GTP, CTP, UTP and derivatives thereof, and at least one 3′-deoxyribonucleotide such as 3′-dATP, 3′-dGTP, 3′-dCTP, 3′-dUTP and derivatives thereof. In this reaction, polyribonucleotides are synthesized by sequential incorporation of ribonucleotides and deoxyribonucleotides into a ribonucleotide sequence in a manner corresponding to the sequence of the template.
However, it was found that 3′-deoxyribonucleotides and derivatives thereof are less readily incorporated into the sequence than the corresponding ribonucleotides, and that the frequency of incorporation may also vary among the ribonucleotides and among the 3′-deoxyribonucleotides depending on the base group each nucleotide has. Such biased incorporation between ribonucleotides and 3′-deoxyribonucleotides, as well as among ribonucleotides having different base groups and among 3′-deoxyribonucleotides having different base groups, may result in short transcription products and fluctuation of signals from labeled ribonucleotides. Therefore, it is difficult to obtain accurate sequence data even though transcription products can be obtained.
Therefore, an object of the present invention is to provide an RNA polymerase exhibiting incorporation ability with no or little bias resulting from differences in nucleotides.
In the description of the present invention, amino acid residues are represented by the conventionally used one-letter codes. For clarity, the codes are listed here only for those amino acids appearing in this text: phenylalanine (F), tyrosine (Y), proline (P), leucine (L), and histidine (H). A numeral accompanying a code is the residue number counted from the N-terminus of the polymerase. For example, “F667” means that the 667th amino acid residue of the polymerase is F, and “F667Y” means that Y was substituted for F at the 667th residue.
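The residue notation described above can be illustrated with a short sketch. It is only an aid to reading the notation and is not part of the invention; the names `ResidueRef`, `parse_residue` and `describe` are hypothetical, and the lookup table covers only the five amino acids listed in this text.

```python
import re
from typing import NamedTuple, Optional

# One-letter codes for the amino acids named in this text only.
ONE_LETTER = {
    "F": "phenylalanine",
    "Y": "tyrosine",
    "P": "proline",
    "L": "leucine",
    "H": "histidine",
}


class ResidueRef(NamedTuple):
    """Hypothetical container for a parsed code such as 'F667' or 'F667Y'."""
    wild_type: str             # residue present in the original polymerase
    position: int              # residue number counted from the N-terminus
    substitute: Optional[str]  # replacing residue, or None if no substitution


# One letter, a positive residue number, and an optional second letter.
_PATTERN = re.compile(r"^([A-Z])([1-9][0-9]*)([A-Z])?$")


def parse_residue(code: str) -> ResidueRef:
    """Parse 'F667' (residue 667 is F) or 'F667Y' (F at 667 replaced by Y)."""
    match = _PATTERN.match(code)
    if match is None:
        raise ValueError(f"not a residue code: {code!r}")
    wild_type, position, substitute = match.groups()
    return ResidueRef(wild_type, int(position), substitute)


def describe(code: str) -> str:
    """Spell out a residue code in words, falling back to the bare letter."""
    ref = parse_residue(code)
    wild = ONE_LETTER.get(ref.wild_type, ref.wild_type)
    if ref.substitute is None:
        return f"residue {ref.position} is {wild} ({ref.wild_type})"
    new = ONE_LETTER.get(ref.substitute, ref.substitute)
    return (f"{wild} ({ref.wild_type}) at residue {ref.position} "
            f"replaced by {new} ({ref.substitute})")
```

For example, `describe("F667Y")` spells out the substitution of tyrosine for phenylalanine at residue 667, while `describe("F667")` only states which residue occupies position 667.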
Incidentally, DNA polymerases are also known to show biased incorporation depending on the base group each nucleotide has, and mutant DNA polymerases free from such biased incorporation are also known [Japanese Patent Unexamined Publication (KOKAI) No. (Hei) 8-205874/1996; and Tabor et al., Proc. Natl. Acad. Sci. USA, 92:6339-6343, (1995)].
The aforementioned literature describes the following. In the sequencing reaction utilizing T7 DNA polymerase, the 526th amino acid of the polymerase contributes to equalizing nucleotide incorporation. Because of the homology between T7 DNA polymerase and other DNA polymerases, the incorporation bias of the other DNA polymerases may be reduced by replacing an amino acid residue in their region homologous to the region of T7 DNA polymerase containing the 526th amino acid. That is, Y (tyrosine) 526 of T7 DNA polymerase results in reduced bias in the efficiency of incorporation of 2′-dNTPs and 2′,3′-ddNTPs. F (phenylalanine) 762 of E. coli DNA polymerase I and F (phenylalanine) 667 of Thermus aquaticus DNA polymerase (generally called Taq DNA polymerase) are the amino acid residues corresponding to Y526 of T7 DNA polymerase, and the bias of these polymerases may be reduced by the substitutions F762Y and F667Y, respectively, in which tyrosine replaces phenylalanine.
The literature further suggests that modification of the region of T7 RNA polymerase corresponding to the region discussed for DNA polymerases, i.e., the residues 631-640, may change its specificity for dNTPs.
However, RNA polymerases have not been used for sequencing methods so far, and therefore differences in the efficiency of ribonucleotide incorporation have not in themselves been a problem. Under such circumstances, no mutant RNA polymerase free from biased incorporation has, of course, been known. In fact, the aforementioned Japanese Patent Unexamined Publication (KOKAI) No. (Hei) 8-205874/1996 does not mention any specific example of modification of T7 RNA polymerase.
The region of T7 RNA polymerase mentioned above is considered to correspond to the region consisting of 9-10 amino acid residues between amino acids K and YG in the motif B mentioned in Protein Engineering, 3:461-467, 1990, a region that is particularly conserved in DNA polymerases α and I and in DNA-dependent RNA polymerases (T7 RNA polymerase is classified among these polymerases). F (phenylalanine) at amino acid residue 762 of E. coli DNA polymerase I and at amino acid residue 667 of Taq DNA polymerase, previously discussed for DNA polymerases, is found in many of the DNA polymerases classified as type I. However, it was surprisingly found that T7 RNA polymerase does not have F (phenylalanine) in the residues 631-640 corresponding to the aforementioned region, even though T7 RNA polymerase is highly homologous to DNA polymerases. Therefore, the teachings of the aforementioned literature could not be realized as described.
Further, the present inventors attempted modification of amino acids of T7 RNA polymerase in the region corresponding to the helix O of the finger subdomain of E. coli DNA polymerase I, in which F762 of E. coli DNA polymerase I is present. However, F (phenylalanine) was not found in the helix Z of T7 RNA polymerase either; this helix is indicated in the steric structure reported by Sousa et al. (Nature, 364:593-599, 1993) and corresponds to the helix O of E. coli DNA polymerase I.
Under these circumstances, the present inventors independently searched for a novel RNA polymerase in order to provide an RNA polymerase whose incorporating ability shows little or no bias depending on the kind of ribonucleotides and 3′-deoxyribonucleotides. As a result, the present invention was completed based on the finding that an RNA polymerase having an increased ability to incorporate 3′-deoxyribonucleotides and derivatives thereof can be obtained by partially modifying amino acids in a wild type RNA polymerase.
As will be apparent from the descriptions hereinafter, the RNA polymerase of the present invention, and in particular the location of its amino acid modification, is neither suggested nor taught at all in Japanese Patent Unexamined Publication (KOKAI) No. (Hei) 8-205874/1996, and was found independently by the present inventors.