1. Field of the Invention
This invention relates to nucleotide analogs and their derivatives (termed non-standard nucleotides) that, when incorporated into DNA and RNA, expand the number of replicable nucleotides beyond the four found in standard DNA and RNA. The invention further relates to processes that incorporate those non-standard nucleotide analogs into oligonucleotide products using the corresponding triphosphate derivatives, and more specifically to polymerases and non-standard nucleoside triphosphates that support the polymerase chain reaction (PCR) with these analogs, including PCR where the products contain more than one non-standard nucleotide.
2. Description of the Related Art
Natural oligonucleotides bind to complementary oligonucleotides according to the well-known rules of nucleobase pairing first elaborated by Watson and Crick in 1953, where adenine (A) pairs with thymine (T) (or uracil, U, in RNA), and guanine (G) pairs with cytosine (C), with the complementary strands anti-parallel to each other. These rules arise from two principles of complementarity, size-complementarity (large purines pair with small pyrimidines) and hydrogen bonding complementarity (hydrogen bond donors pair with hydrogen bond acceptors).
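By way of illustration only, the pairing rules and the anti-parallel orientation described above can be sketched in a few lines of Python. The names `STANDARD_PAIRS` and `reverse_complement` are hypothetical and chosen for this sketch; only the four standard DNA nucleobases are covered, and RNA (with U in place of T) is not handled.

```python
# Minimal sketch of Watson-Crick pairing for standard DNA.
# Names are illustrative assumptions, not part of any library.
STANDARD_PAIRS = {
    "A": "T",  # purine A pairs with pyrimidine T
    "T": "A",
    "G": "C",  # purine G pairs with pyrimidine C
    "C": "G",
}


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand, read 5'-to-3'.

    Because complementary strands are anti-parallel, the input is
    traversed in reverse while each nucleobase is replaced by its partner.
    """
    try:
        return "".join(STANDARD_PAIRS[base] for base in reversed(sequence.upper()))
    except KeyError as exc:
        raise ValueError(f"not a standard DNA nucleobase: {exc.args[0]!r}") from None


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```

An expanded alphabet could, in principle, be modeled by adding further entries to the pairing table, provided each added letter has exactly one partner.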
It is now well established in the art that the number of independently replicable nucleotides in DNA can be increased, where the size- and hydrogen bonding complementarities are retained, but where different heterocycles (nucleobase analogs) attached to the sugar-phosphate backbone implement different hydrogen bonding patterns. As many as eight different nucleobase analogs forming four additional nucleobase pairs are conceivable (see, for example, [Benner, S. A. (1995) Non-standard Base Pairs with Novel Hydrogen Bonding Patterns. U.S. Pat. No. 5,432,272 (Jul. 11, 1995)]). This has led to an “artificially expanded genetic information system” (AEGIS). The ability of pairing between the additional nucleobase pairs to support DNA duplex stability has had substantial use in diagnostics. In this disclosure, DNA includes oligonucleotides containing AEGIS nucleic acids and their analogs in linear and non-linear topologies, including as dendrimers, comb-structures, and nanostructures, and these oligonucleotides and their analogs carrying tags (e.g., fluorescent, functionalized, or binding) attached to the ends, sugars, or nucleobases.
It would be useful to amplify oligonucleotides containing AEGIS components in processes analogous to the well-known polymerase chain reaction (PCR), here defined as a process involving thermal cycling, where the heat step denatures the duplex formed at each cycle to allow a new set of primers to bind. If PCR could be implemented with expanded DNA AEGIS alphabets, it would have many uses, including (without limitation) DNA- and RNA-targeted diagnostics, and in vitro selection and evolution to create catalysts, ligands, and receptors.
Various items in the art describe efforts to use the U.S. Pat. No. 5,432,272 nucleobases with polymerases to support PCR. However, these generally failed to sustain PCR over more than five heat-cool cycles, since polymerases that incorporate non-standard base pairs into duplexes with sufficient efficiency and fidelity to support PCR were not described. This failure is illustrated by Johnson et al. [Johnson, S. C., Sherrill, C. B., Marshall, D. J., Moser, M. J., Prudent, J. R. (2004) A third base pair for the polymerase chain reaction: inserting isoC and isoG. Nucl. Acids Res. 32, 1937-1941], who attempted to incorporate the isocytosine and isoguanine disclosed in U.S. Pat. No. 5,432,272 into PCR. As their publication shows, the non-standard component is not retained in the product to an extent greater than 90% over five cycles. Indeed, their FIG. 2 showed that only ~90% of the isoC:isoG pair remained after just one cycle, and only ~80% was retained after seven cycles. The fraction of a non-standard nucleobase pair retained per cycle can be used as a metric for the utility of a PCR process that incorporates that pair. In this case, the loss was attributed to the ability of a minor tautomeric form of isoguanosine to pair with thymidine, as well as to contacts that thermostable polymerases (the kind that are needed for useful PCR, as they survive heating to at least 80° C. for the purpose of separating strands) make to unshared electrons in the minor groove, which are presented by DNA from the exocyclic C═O groups of C and T, and N3 of A and G.
Many enzymes work well with AEGIS components, including kinases, ligases, and even ribosomes [Bain, J. D., Chamberlin, A. R., Switzer, C. Y., Benner, S. A. (1992) Ribosome-mediated incorporation of non-standard amino acids into a peptide through expansion of the genetic code. Nature 356, 537-539]. Polymerases, in contrast, accept many non-standard components of DNA only inefficiently, judging by rate, processivity, fidelity, or some combination of these [Horlacher, J., Hottiger, M., Podust, V. N., Hübscher, U. and Benner, S. A. (1995) Expanding the genetic alphabet: Recognition by viral and cellular DNA polymerases of nucleosides bearing bases with non-standard hydrogen bonding patterns, Proc. Natl. Acad. Sci. 92, 6329-6333]. These inefficiencies need not prevent the utility of polymerase-based incorporation of AEGIS components in single-pass experiments, and may not be apparent in standing start experiments, where the non-standard triphosphate is the first nucleotide to be added to a primer, or in running start experiments, where the polymerase adds standard nucleotides before it is challenged to incorporate a non-standard nucleotide. However, they defeat sustained amplification by PCR where over 90% of the nucleobase is retained after the first theoretical cycle, here defined as “useful PCR”.
Thus, neither U.S. Pat. No. 5,432,272 nor the other prior art enables useful PCR of DNA containing non-standard nucleotides (AEGIS components). While it is recognized by those of ordinary skill in the art, and taught here, that PCR processes invariably introduce some mutations, and that some daughter oligonucleotides will not have exactly the same sequence as the original oligonucleotide (and indeed, sequence evolution due to this infidelity is useful for in vitro evolution, see U.S. Pat. No. 8,586,303), PCR amplification of these oligonucleotides would be most useful if the level of mutation is lower rather than higher, preferably less than a 5% loss of the non-standard nucleobase per cycle, and more preferably less than a 2% loss of the non-standard nucleobase per cycle, and in any case retaining 90% of the AEGIS component after the first cycle.
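To make the per-cycle thresholds above concrete, the following Python sketch compounds a fixed fractional loss over a number of theoretical cycles. It rests on a simplifying assumption that is not stated in the art cited here: that the same fraction of the non-standard nucleobase is lost at every cycle, so that retention after n cycles is (1 - loss)^n. The function names are hypothetical, and the "useful PCR" check is read here as requiring strictly more than 90% retention after the first cycle.

```python
# Illustrative model only: assumes a constant fractional loss per cycle.
def retained_fraction(loss_per_cycle: float, cycles: int) -> float:
    """Fraction of the non-standard nucleobase retained after `cycles` cycles."""
    if not 0.0 <= loss_per_cycle <= 1.0:
        raise ValueError("loss_per_cycle must lie between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return (1.0 - loss_per_cycle) ** cycles


def is_useful_pcr(loss_per_cycle: float) -> bool:
    """True if more than 90% is retained after the first theoretical cycle."""
    return retained_fraction(loss_per_cycle, 1) > 0.90


if __name__ == "__main__":
    # Compare the preferred (5%) and more preferred (2%) per-cycle losses.
    for loss in (0.05, 0.02):
        for n in (1, 10, 20):
            print(f"loss={loss:.0%} cycles={n:2d} "
                  f"retained={retained_fraction(loss, n):.3f}")
```

Under this model even a small per-cycle loss compounds quickly over the many cycles typical of PCR, which is why the lower loss rate is the more preferred one.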
U.S. Pat. No. 8,354,225 (Ser. No. 11/371,497) attempted to achieve a less ambitious process, here with an extra nucleotide pair formed between diaminopyrimidine and either xanthosine or 5-aza-7-deazaxanthosine, one that did not involve thermocycling. This was shown to be possible with a mutant form of the reverse transcriptase from HIV. Unfortunately, reverse transcriptases are not thermally stable upon heating to 80° C. (or, in most cases, even above 50° C.), and therefore cannot support PCR. Indeed, U.S. Pat. No. 8,354,225 required an addition of more reverse transcriptase after each heat step. Further, the other pyrimidine nucleoside analogs that U.S. Pat. No. 8,354,225 disclosed had nucleobases based on a pyrazine ring system, now known to epimerize rapidly. Finally, the structure disclosed by U.S. Pat. No. 8,354,225 to implement the purine analog with a hydrogen bond donor-donor-acceptor pattern is now known to be nonfunctional, and the pyrimidine analog shown to implement the hydrogen bond donor-donor-acceptor pattern lacks a methyl group and is now known to be unstable with respect to depyrimidinylation. FIG. 1 summarizes these deficiencies.
For these reasons, despite the widespread recognition of the value of PCR using non-standard nucleobases, if it could be achieved, many in the art considered this goal unachievable.