1. Field of the Invention
The present invention relates to compositions and methods for altering the fidelity of nucleic acid synthesis.
More particularly, the present invention relates to the following general areas: (1) nucleotide triphosphate monomers having at least one molecular or atomic tag bonded to and/or chemically and/or physically associated with one or more of the phosphate groups of the triphosphate moiety of the monomers, the base moiety, and/or the sugar moiety in the case of a nucleoside analog; (2) methods for enzymatic DNA synthesis with altered fidelity; (3) methods of sequencing DNA, based on the detection of base incorporation using tags bonded to and/or chemically and/or physically associated with the β and/or γ phosphates of the triphosphate of the nucleotide monomer, the base moiety of a nucleotide or nucleoside monomer, and/or the sugar moiety of a nucleotide or nucleoside monomer, the polymerase or by the release of the tagged pyrophosphate (PPi); (4) a template-mediated primer extension reaction with improved monomer incorporation fidelity using the tagged monomers; (5) methods for performing a primer extension reaction, such as a DNA sequencing reaction, or a polymerase chain reaction using the tagged monomers; (6) methods for improving nucleotide incorporation fidelity by adding tagged pyrophosphate (PPi) to a monomer polymerization medium, where the monomers can be tagged or untagged; and (7) kits for conducting nucleotide sequencing, a polymerase chain reaction, a template-mediated primer extension reaction or similar reaction with improved monomer incorporation fidelity using tagged pyrophosphate and/or untagged or tagged monomers.
2. Description of the Related Art
Sequencing Nucleic Acids Using Tagged Monomers
The primary sequences of nucleic acids are crucial for understanding the function and control of genes and for applying many of the basic techniques of molecular biology. The ability to perform rapid and reliable DNA sequencing is, therefore, a very important technology. The DNA sequence is an important tool in genomic analysis as well as in other applications, such as genetic identification, forensic analysis, genetic counseling, and medical diagnostics. With respect to medical diagnostic sequencing, disorders, susceptibilities to disorders, and prognoses of disease conditions can be correlated with the presence of particular DNA sequences, or the degree of variation (or mutation) in DNA sequences, at one or more genetic loci. Examples of such phenomena include human leukocyte antigen (HLA) typing, cystic fibrosis, tumor progression and heterogeneity, p53 proto-oncogene mutations and ras proto-oncogene mutations. See, e.g., Gyllensten et al., PCR Methods and Applications, 1: 91–98 (1991); U.S. Pat. No. 5,578,443, issued to Santamaria et al., incorporated herein by reference; and U.S. Pat. No. 5,776,677, issued to Tsui et al., incorporated herein by reference.
Various approaches to DNA sequencing exist. The dideoxy chain termination method serves as the basis for all currently available automated DNA sequencing machines. See, e.g., Sanger et al., Proc. Natl. Acad. Sci., 74: 5463–5467 (1977); Church et al., Science, 240: 185–188 (1988); and Hunkapiller et al., Science, 254: 59–67 (1991). Other methods include the chemical degradation method, see, e.g., Maxam et al., Proc. Natl. Acad. Sci., 74: 560–564 (1977); whole-genome approaches, see, e.g., Fleischmann et al., Science, 269: 496 (1995); expressed sequence tag sequencing, see, e.g., Velculescu et al., Science, 270 (1995); array methods based on sequencing by hybridization, see, e.g., Koster et al., Nature Biotechnology, 14: 1123 (1996); and single molecule sequencing (SMS), see, e.g., Jett et al., J. Biomol. Struct. Dyn., 7: 301 (1989), Schecker et al., Proc. SPIE-Int. Soc. Opt. Eng., 2386: 4 (1995), and Hardin et al., U.S. patent application Ser. No. 09/901,782, filed Jul. 9, 2001, incorporated herein by reference.
Fluorescent dyes can be used in a variety of these DNA sequencing techniques. A fluorophore moiety or dye is a molecule capable of generating a fluorescence signal. A quencher moiety is a molecule capable of absorbing the energy of an excited fluorophore, thereby quenching the fluorescence signal that would otherwise be released from the excited fluorophore. In order for a quencher to quench an excited fluorophore, the quencher moiety must be within a minimum quenching distance of the excited fluorophore moiety at some time prior to the fluorophore releasing the stored fluorescence energy.
Fluorophore-quencher pairs have been incorporated into oligonucleotide probes in order to monitor biological events based on the fluorophore and quencher being separated or brought within a minimum quenching distance of each other. For example, probes have been developed wherein the intensity of the fluorescence increases due to the separation of the fluorophore-quencher pair. Probes have also been developed which lose their fluorescence because the quencher is brought into proximity with the fluorophore.
These fluorophore-quencher pairs have been used to monitor hybridization assays and nucleic acid amplification reactions, especially polymerase chain reactions (PCR), by monitoring either the appearance or disappearance of the fluorescence signal generated by the fluorophore molecule.
The decreased fluorescence of a fluorophore moiety by collision or direct interaction with a quencher is due mainly to a transfer of energy from the fluorophore in the excited state to the quencher. The extent of quenching depends on the concentration of quencher and is described by the Stern-Volmer relationship:

$$F_0/F = 1 + K_{SV}[Q]$$

wherein $F_0$ and $F$ correspond to the fluorescence in the absence and presence of quencher, respectively, and $[Q]$ is the quencher concentration. A plot of $F_0/F$ versus $[Q]$ yields a straight line with a slope corresponding to the Stern-Volmer constant, $K_{SV}$. The foregoing equation takes into account the dynamic (collisional) quenching, which is the dominant component of the quenching reaction. A linear Stern-Volmer plot can be obtained when the quenching is due entirely to a dynamic (collisional) process or entirely to static complex formation. A non-linear plot will occur when static and collisional quenching occur simultaneously (see A. M. Garcia, Methods in Enzymology, 207: 501–511 (1992)).
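The Stern-Volmer relationship can be illustrated with a short numerical sketch. The Python below is illustrative only and is not part of any cited method: the function names, the quencher concentrations and the value of the constant are hypothetical, and the fit assumes purely linear quenching with the intercept fixed at 1.

```python
def stern_volmer_ratio(k_sv, q):
    """Return F0/F predicted by F0/F = 1 + K_SV * [Q]."""
    return 1.0 + k_sv * q


def fit_k_sv(concentrations, f0, intensities):
    """Estimate K_SV as the least-squares slope of (F0/F - 1) versus [Q].

    The intercept is fixed at 1, as the linear relationship requires.
    """
    num = sum(q * (f0 / f - 1.0) for q, f in zip(concentrations, intensities))
    den = sum(q * q for q in concentrations)
    if den == 0:
        raise ValueError("at least one non-zero quencher concentration is needed")
    return num / den


# Hypothetical, noise-free data generated from an assumed K_SV (per molar).
K_ASSUMED = 120.0
quencher_molar = [0.0, 0.002, 0.004, 0.006, 0.008]
F0 = 1000.0  # fluorescence with no quencher, arbitrary units
observed = [F0 / stern_volmer_ratio(K_ASSUMED, q) for q in quencher_molar]

k_fit = fit_k_sv(quencher_molar, F0, observed)
print(f"fitted K_SV = {k_fit:.3f}")
```

With real measurements, systematic curvature in the plotted points would suggest that static and collisional quenching are both contributing, in which case a single slope is not an adequate description.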
In general, fluorophore moieties preferably have a high quantum yield and a large extinction coefficient so that the dye can be used to detect small quantities of the component being detected. Fluorophore moieties preferably have a large Stokes shift (i.e., the difference between the wavelength at which the dye has maximum absorbance and the wavelength at which the dye has maximum emission) so that the fluorescent emission is readily distinguished from the light source used to excite the dye.
One class of fluorescent dyes which has been developed is the energy transfer fluorescent dyes. For instance, U.S. Pat. Nos. 5,800,996 and 5,863,727, issued to Lee et al., incorporated herein by reference, disclose donor and acceptor energy transfer fluorescent dyes and linkers useful for DNA sequencing. Other fluorophore-quencher pairs are disclosed in PCT Application Serial No. PCT/US99/29584, incorporated herein by reference. In energy transfer fluorescent dyes, the acceptor molecule is a fluorophore which is excited at the wavelength of light corresponding to the fluorescence emission of the excited donor molecule. When excited, the donor dye transmits its energy to the acceptor dye.
Therefore, emission from the donor is partially or totally quenched due to partial or total energy transfer from the excited donor to the acceptor dye, resulting in the excitation of the latter for emission at its characteristic wavelength (i.e., a wavelength different from that of the donor dye, which may represent a different color if the emissions are in the visible portion of the spectrum). The advantage of this mechanism is twofold: the emission from the acceptor dye is more intense than that from the donor dye alone when the acceptor has a higher fluorescence quantum yield than the donor (see Li et al., Bioconjugate Chem., 10: 242–245 (1999)), and attachment of acceptor dyes with differing emission spectra allows differentiation among molecules by fluorescence using a single excitation wavelength.
Nucleotide triphosphates having a fluorophore moiety attached to the γ-phosphate are of interest because this modification still allows the modified NTPs to be enzyme substrates. For instance, Felicia et al. describe the synthesis and spectral properties of an “always-on” fluorescent ATP analog, adenosine-5′-triphosphoroyl-(5-sulfonic acid)naphthyl ethylamidate (γ-1,5-EDANS) ATP. See also Yarbrough et al. 1978, JBC. The analog is a good substrate for E. coli RNA polymerase and can be used to initiate the RNA chain. The ATP analog is incorporated into the synthesized RNA and is a good probe for studies of nucleotide-protein interactions, active site mapping and other ATP-utilizing biological systems. See, e.g., Felicia et al., Arch. Biochem. Biophys., 246: 564–571 (1986).
In addition, Sato et al. disclose a homogeneous enzyme assay that uses a fluorophore moiety (bimane) attached to the γ-phosphate group of the nucleotide and a quencher moiety attached to the 5-position of uracil. The quencher moiety is in the form of a halogen bound to the C-5 position of the pyrimidine. The quenching that is effected by this combination is eliminated by cleavage of the phosphate bond by the phosphodiesterase enzyme. The halogen quencher used in the assay is very inefficient, producing only about a two-fold decrease in fluorescent efficiency.
Template-Mediated Primer Extension Reaction
In a template-mediated primer extension reaction, an oligonucleotide primer having homology to a single-stranded template nucleic acid is caused to anneal to the template nucleic acid. The annealed mixture is then provided with a DNA polymerase in the presence of nucleoside triphosphates under conditions in which the DNA polymerase extends the primer to form a complementary strand to the template nucleic acid. In a Sanger-type DNA sequencing reaction, the primer is extended in the presence of a chain-terminating agent, e.g., a dideoxynucleoside triphosphate, to cause base-specific termination of the primer extension (Sanger). In a polymerase chain reaction, two primers are provided, each having homology to opposite strands of a double-stranded DNA molecule. After the primers are extended, they are separated from their templates, and additional primers are caused to anneal to the templates and the extended primers. The additional primers are then extended. The steps of separating, annealing, and extending are repeated in order to geometrically amplify the number of copies of the template nucleic acid (Saiki).
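The geometric amplification produced by repeated cycles of separating, annealing and extending can be sketched numerically. The Python below is a simplified illustration, not a model taken from the cited references: the function name and the per-cycle `efficiency` parameter are assumptions introduced here, and effects such as reagent depletion or pyrophosphate accumulation are ignored.

```python
def copies_after_cycles(initial_copies, cycles, efficiency=1.0):
    """Idealized template copy number after a number of amplification cycles.

    efficiency is the assumed fraction of templates copied in each cycle
    (1.0 means every template is duplicated, i.e. perfect doubling).
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Perfect doubling versus an assumed 90% per-cycle efficiency.
for n in (10, 20, 30):
    ideal = copies_after_cycles(1, n)
    lossy = copies_after_cycles(1, n, efficiency=0.9)
    print(f"{n} cycles: ideal {ideal:.3e}, 90% efficiency {lossy:.3e}")
```

Because the growth is geometric, a modest loss of per-cycle efficiency compounds over many cycles, which is one reason an inhibitor of extension can have a large effect on final yield.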
In both DNA sequencing and PCR, it is critically important that the primer extension product accurately replicate the nucleotide sequence of the template nucleic acid. However, under certain conditions, peak “dropout” has been observed wherein certain nucleotides are not represented in the primer extension product. This problem is believed to be caused by pyrophosphorolysis of the primer extension product by a reverse nucleotide addition reaction promoted by the accumulation of pyrophosphates in the reaction mixture. See Mullis; Tabor 1990; Tabor 1996.
Pyrophosphate Effects on Nucleic Acid Synthesis and/or Sequencing
It has been recognized that pyrophosphorolysis, where an oligonucleotide is reduced in length, is detrimental to primer extension reactions. The pyrophosphorolysis is caused by the availability of pyrophosphate. For example, PCR is inhibited by the addition of pyrophosphate even at very low concentrations. According to U.S. Pat. No. 5,498,523, this pyrophosphorolysis can be prevented by providing an agent, for example, a pyrophosphatase, capable of removing pyrophosphate. Addition of pyrophosphatase to a PCR greatly enhances the progress of the reaction and provides superior results compared to the reaction without a pyrophosphatase. See U.S. Pat. No. 4,800,159, incorporated herein by reference.
Similarly, the addition of a pyrophosphatase to a sequencing reaction provides more uniformity in intensities of bands formed in a polyacrylamide gel used to identify products of the sequencing reaction. This uniformity is due to prevention of degradation of specific DNA products by pyrophosphorolysis. See also, Tabor, S. and Richardson, C. C., J. Biol. Chem. 265:8322 (1990) and U.S. Pat. No. 4,962,020, incorporated herein by reference.
Each product or band in a dideoxy sequencing experiment is a polynucleotide complementary to the template and terminated at the 3′ end in a base-specific manner with a dideoxynucleotide. The dideoxynucleotide stabilizes the product, preventing further polymerization of the polynucleotide. However, in certain regions of the template, the bands, especially after prolonged reaction, will decrease in intensity or disappear completely (“drop-out” bands). In certain sequence contexts, the PPi contained within the enzyme is thought to remain there for an extended period of time. A drop-out may not be readily detected by the operator, leading to errors in the interpretation of the data by either a human or a computer-driven analyzer. Since this phenomenon is stimulated by inorganic pyrophosphate, the effect is presumably due to pyrophosphorolysis (reverse polymerization), not 3′-exonucleolytic activity. It is hypothesized that DNA polymerase idling at the end of these terminated products, in the presence of sufficient pyrophosphate, will remove the dideoxynucleotide and then extend from the now free 3′-hydroxyl end to another dideoxy termination. In effect, the bands are converted to longer polynucleotide bands. Removal of pyrophosphate as it is generated in the polymerization reaction eliminates this problem.
Sequencing by Direct Detection of Released Tagged Pyrophosphate
Researchers have used a series of enzyme reactions coupled to pyrophosphate generation to measure DNA polymerase activity. In the first (P. Nyren, Anal. Biochem. 167:235 (1987)), Nyren used ATP:sulfate adenylyltransferase to convert pyrophosphate and adenosine 5′-phosphosulfate to ATP and sulfate ion. The ATP was used to make light with luciferase. In the second (J. C. Johnson et al., Anal. Biochem. 26:137 (1968)), the researchers reacted the pyrophosphate with UDP-glucose in the presence of UTP:glucose-1-phosphate uridylyltransferase to produce UTP and glucose-1-phosphate. In two more steps, polymerase activity was measured spectrophotometrically by the conversion of NADP to NADPH. While these articles describe the use of ATP:sulfate adenylyltransferase and UTP:glucose-1-phosphate uridylyltransferase in measuring DNA polymerase activity, they do not describe their use to prevent or inhibit pyrophosphorolysis in nucleic acid synthesis reactions.
DNA sequencing is an essential tool in molecular genetic analysis. The ability to determine DNA nucleotide sequences has become increasingly important as efforts have commenced to determine the sequences of the large genomes of humans and other higher organisms.
The two most commonly used methods for DNA sequencing are the enzymatic chain-termination method of Sanger and the chemical cleavage technique of Maxam and Gilbert.
Both methods rely on gel electrophoresis to resolve, according to their size, DNA fragments produced from a larger DNA segment. Since the electrophoresis step and the subsequent detection of the separated DNA fragments are cumbersome procedures, a great effort has been made to automate these steps. However, despite the fact that automated electrophoresis units are commercially available, electrophoresis is not well suited for large-scale genome projects or clinical sequencing, where relatively cost-effective units with high throughput are needed. Thus, the need for nonelectrophoretic methods for sequencing is great, and several alternative strategies have been described to overcome the disadvantages of electrophoresis, such as scanning tunneling microscopy (Driscoll et al., 1990, Nature, 346, 294–296), sequencing by hybridization (Bains et al., 1988, J. Theor. Biol., 135, 303–307) and single molecule detection (Jett et al., 1989, J. Biomol. Struct. Dyn., 7, 301–306).
Techniques enabling the rapid detection of a single DNA base change are also important tools for genetic analysis. In many cases, detection of a single base or a few bases would be a great help in genetic analysis, since several genetic diseases and certain cancers are related to minor mutations. A mini-sequencing protocol based on a solid phase principle has been described (Hultman et al., 1988, Nucl. Acids Res., 17, 4937–4946; Syvanen et al., 1990, Genomics, 8, 684–692). The incorporation of a radiolabeled nucleotide was measured and used for analysis of the three-allelic polymorphism of the human apolipoprotein E gene. However, radioactive methods are not well suited for routine clinical applications, and hence the development of a simple non-radioactive method for rapid DNA sequence analysis has also been of interest.
Methods of sequencing based on the concept of detecting inorganic pyrophosphate (PPi) which is released during a polymerase reaction have been described (WO 93/23564 and WO 89/09283). As each nucleotide is added to a growing nucleic acid strand during a polymerase reaction, a pyrophosphate molecule is released. It has been found that pyrophosphate released under these conditions can be detected enzymatically, e.g., by the generation of light in the luciferase-luciferin reaction. Such methods enable a base to be identified in a target position and DNA to be sequenced simply and rapidly whilst avoiding the need for electrophoresis and the use of harmful radiolabels. See, for example, U.S. Pat. No. 5,498,523, incorporated herein by reference.
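The logic of identifying bases from released pyrophosphate can be sketched as a small simulation. The Python below is a hypothetical, idealized illustration and not the procedure of the cited publications: it assumes one nucleotide type is added at a time in a fixed order, that the detected signal equals the number of PPi molecules released in that step, and that unincorporated nucleotide is fully removed before the next addition. All function and variable names are introduced here for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def ppi_signals(template_3_to_5, dispensation_order="ACGT", max_rounds=100):
    """Simulate stepwise nucleotide addition against a template.

    template_3_to_5 is the template written 3'->5', so the synthesized
    strand is read 5'->3' in the same left-to-right order.
    Returns a list of (nucleotide, ppi_count) pairs, one per addition step.
    """
    needed = [COMPLEMENT[base] for base in template_3_to_5]
    position = 0
    signals = []
    for _ in range(max_rounds):
        for nucleotide in dispensation_order:
            released = 0
            # One PPi is released for each nucleotide incorporated.
            while position < len(needed) and needed[position] == nucleotide:
                released += 1
                position += 1
            signals.append((nucleotide, released))
            # Assumed wash step: leftover nucleotide is removed here.
            if position == len(needed):
                return signals
    return signals


def decode(signals):
    """Rebuild the synthesized strand from the per-step PPi counts."""
    return "".join(nucleotide * count for nucleotide, count in signals)


steps = ppi_signals("TTGCA")
print(steps)
print(decode(steps))
```

A step that yields no signal indicates only that the added nucleotide was not the next complementary base, which is why each addition must be separated from the next.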
However, the PPi-based sequencing methods mentioned above are not without drawbacks. The template must be washed thoroughly between each nucleotide addition to remove all non-incorporated deoxynucleotides. This makes it difficult to sequence a template which is not bound to a solid support. In addition new enzymes must be added with each addition of deoxynucleotide.
Thus, there is a need for improved methods of sequencing which allow rapid detection and provision of sequence information, which have increased fidelity, and which are simple and quick to perform, lending themselves readily to automation.