Natural oligonucleotides bind to complementary oligonucleotides according to the well-known rules of nucleobase pairing first elaborated by Watson and Crick in 1953, where adenine (A) pairs with thymine (T) (or uracil, U, in RNA), and guanine (G) pairs with cytosine (C), with the complementary strands anti-parallel to one another. In this disclosure, “DNA” or “nucleic acid” is understood to include, as appropriate, both DNA (where the sugar is 2′-deoxyribose) and RNA (where the sugar is ribose), as well as derivatives where the sugar is modified, as in 2′-O-methyl, 2′-O-allyl, 2′-deoxy-2′-fluoro, and 2′,3′-dideoxynucleoside derivatives, nucleic acid analogs based on other sugar backbones, such as threose, locked nucleic acid derivatives, bicyclo sugars, or hexose, glycerol and glycol sugars [Zhang, L., Peritz, A., Meggers, E. (2005) A simple glycol nucleic acid. J. Am. Chem. Soc. 127, 4174-4175], nucleic acid analogs based on non-ionic backbones, such as “peptide nucleic acids”, these nucleic acids and their analogs in non-linear topologies, including as dendrimers, comb-structures, and nanostructures, and these nucleic acids and their analogs carrying tags (e.g., fluorescent, functionalized, or binding) attached to the ends, sugars, or nucleobases.
These pairing rules allow for the specific hybridization of an oligonucleotide to a complementary oligonucleotide, making oligonucleotides valuable as probes in the laboratory, in diagnostic applications, as messages that can direct the synthesis of specific proteins, and in a wide range of other applications well known in the art. Such base pairing is used, for example and without limitation, to capture other oligonucleotides to beads, arrays, and other solid supports, to allow nucleic acids to fold in hairpins, beacons, and catalysts, as supports for functionality, such as fluorescence, fluorescence quenching, binding/capture tags, and catalytic functionality, as part of more complex architectures, including dendrimers and nanostructures, and as scaffolds to guide chemical reactions.
Further, nucleobase pairing is the basis by which enzymes are able to catalyze the synthesis of new oligonucleotides that are complementary to template oligonucleotides. In this synthesis, building blocks (normally the triphosphates of ribo- or deoxyribonucleosides carrying A, T, U, C, or G) are directed by a template oligonucleotide to form an oligonucleotide with the complementary sequence. This process is the basis for replication of all forms of life, and also serves as the basis for technologies for enzymatic synthesis and amplification of specific heterosequence nucleic acids by enzymes such as DNA and RNA polymerase, in the polymerase chain reaction (PCR), and in a variety of architectures that may involve synthesis, ligation, cleavage, immobilization and release, inter alia, used in technology to detect nucleic acids.
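By way of illustration only, the template-directed formation of a complementary strand can be sketched in a few lines of Python. The function name `complement_strand` and the example sequence are hypothetical and are not part of the disclosure; the sketch applies only the standard A-T and G-C pairing rules and the anti-parallel orientation of the two strands.

```python
# Standard Watson-Crick partners for DNA (illustrative lookup table).
WATSON_CRICK_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(template: str) -> str:
    """Return the strand complementary to `template`, written 5'->3'.

    The template is read 5'->3'. Because the two strands are anti-parallel,
    the complementary strand is the base-by-base complement in reverse order.
    """
    try:
        return "".join(WATSON_CRICK_DNA[base] for base in reversed(template.upper()))
    except KeyError as err:
        # Any letter outside A, T, G, C has no partner in this four-letter table.
        raise ValueError(f"no standard partner for nucleobase {err}") from None


if __name__ == "__main__":
    print(complement_strand("ATGGCA"))  # prints TGCCAT
```

Because the lookup table contains only four letters, any additional nucleobase would require an additional entry, which is the limitation that the expanded alphabets discussed below are intended to address.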
Nucleobase pairing following rules of complementarity is known to be useful in a variety of architectures. In solution, nucleobase pairing in the loop of a molecular beacon can open the beacon, separating a fluorescent species attached to one end of a hairpin structure from a quencher on the other. Pairing can assemble two DNA fragments transiently or covalently, as in a template-directed ligation. Pairing is useful for affixing an oligonucleotide that is free in solution to a support carrying the complementary oligonucleotide. The oligonucleotide can carry functional groups, including fluorescent groups attached to the nucleobases.
The Watson-Crick pairing rules can be understood chemically in terms of the arrangement of hydrogen bonding groups on the heterocyclic nucleobases of the oligonucleotide, groups that can either be hydrogen bond donors or acceptors. In the standard Watson-Crick geometry, a large purine nucleobase pairs with a small pyrimidine nucleobase. Thus, the AT nucleobase pair is the same size as a GC nucleobase pair. This means that the rungs of the DNA ladder, formed from either AT or GC nucleobase pairs, all have the same length.
Further recognition between nucleobases is determined by hydrogen bonds between the nucleobases. In standard nucleobases, hydrogen bond donors are heteroatoms (nitrogen or oxygen in the natural nucleobases) bearing a hydrogen; hydrogen bond acceptors are heteroatoms (nitrogen or oxygen in the natural nucleobases) with a lone pair of electrons. In the geometry of the Watson-Crick nucleobase pair, a six membered ring (in standard nucleobases, a pyrimidine) is juxtaposed to a ring system composed of a fused six membered ring and a five membered ring (in standard nucleobases, a purine), with a middle hydrogen bond linking two ring atoms, and hydrogen bonds on either side joining functional groups appended to each of the rings, with donor groups paired with acceptor groups.
In many applications, the nucleobases incorporated into one or more oligonucleotide analogs carry an appendage. In standard nucleobases, the appendage, or side chain, is attached to one or more pyrimidines at the 5-position, or at the 7-position of a 7-deazapurine, or to an exocyclic nitrogen, most often the exocyclic amino group of adenine or cytosine. Such nucleoside analogs have application because of their combination of Watson-Crick nucleobase pairing ability and the properties or reactivities associated with species appended via the side chain. For example, oligonucleotides containing a T to which is appended a side chain bearing a biotin residue can first bind to a complementary oligonucleotide, and the hybrid can then be isolated by virtue of the specific affinity of biotin to avidin [Langer, P. R.; Waldrop, A. A.; Ward, D. C. (1981) Proc. Natl. Acad. Sci. 78, 6633-6637]. This finds application in diagnostic work. Instead of biotin, the side chain can carry a fluorescent moiety, or a moiety that quenches the fluorescence of another moiety, a branching point, or a moiety that complexes to a metal, or a moiety that confers catalytic activity on the oligonucleotide, or a moiety that assists in the attachment of the oligonucleotide analog to a solid support, such as a bead, a one dimensional array, or a two dimensional array.
Often, derivatized standard nucleotides can be incorporated into oligonucleotides by enzymatic transcription of natural oligonucleotide templates in the presence of the triphosphate of the derivatized nucleoside, the substrate of the appropriate (DNA or RNA) polymerase, or a reverse transcriptase. In this process, a natural nucleoside is placed in the template, and standard Watson-Crick nucleobase pairing is exploited to direct the incoming modified nucleoside opposite to it in the growing oligonucleotide chain.
The standard available nucleobase pairs are limited in that they make available only two mutually exclusive hydrogen bonding patterns. This means that should one wish to introduce a modified nucleoside based on one of the natural nucleosides into an oligonucleotide, it would be incorporated wherever the complementary natural nucleoside is found in the template. For many applications, this is undesirable.
Further, in many applications, it would be desirable to have nucleobase pairs that behave as predictably as the AT (or U) and GC nucleobase pairs, but that do not cross-pair with natural oligonucleotides, which are built from A, T (or U), G, and C. This is especially true in diagnostic assays. Biological samples generally contain many nucleic acid molecules in addition to the nucleic acid that one wishes to detect. The adventitious DNA/RNA, often present in abundance over the targeted analyte DNA (or RNA), is also composed of A, T (or U), G, and C. Thus, adventitious DNA/RNA can compete with the desired interactions between two or more oligonucleotide-like molecules.
Many of the limitations that arise from the existence of only four standard nucleobases, joined in only two types of nucleobase pairs via only two types of hydrogen bonding schemes, could be overcome were additional nucleobases available that could be incorporated into oligonucleotides. Here, the additional nucleobases would still pair in the Watson-Crick geometry, but would present hydrogen bond donating and accepting groups in a pattern different from those presented by the natural nucleobases. They therefore would form nucleobase pairs with additional complementary nucleobases in preference to other nucleobases (and, preferably, with strong preference, meaning with an affinity at least 10 to 100 fold greater than that for mismatched oligonucleotides or oligonucleotide analogs). In the last decade, Benner disclosed compositions of matter that were intended to overcome the limitations of molecular recognition by changing the pattern of hydrogen bond donor and acceptor groups presented by a nucleobase to the nucleobase on a complementary oligonucleotide analog [U.S. Pat. Nos. 5,432,272, 5,965,364, 6,001,983, 6,037,120, 6,140,496, 6,627,456, 6,617,106]. These disclosures showed that the geometry of the Watson-Crick nucleobase pair can accommodate as many as 12 nucleobases forming 6 mutually exclusive pairs. Of these, four nucleobases forming two pairs are “standard”, while eight nucleobases forming four pairs were termed “non-standard”. Adding the non-standard nucleobases to the standard nucleobases yielded an Artificially Expanded Genetic Information System (AEGIS). Specifically, the structures shown in FIG. 1, taken from U.S. Pat. No. 6,140,496, implement the designated hydrogen bonding patterns. It was also noted that these nucleobase analogs might be functionalized to enable a single biopolymer capable of both genetics and catalysis.
Expanded genetic alphabets have now been further explored in a variety of laboratories, and the possibility of a fully artificial genetic system has been advanced [Switzer, C. Y., Moroney, S. E., Benner, S. A. (1989) Enzymatic incorporation of a new base pair into DNA and RNA. J. Am. Chem. Soc. 111, 8322-8323] [Piccirilli, J. A., Krauch, T., Moroney, S. E., Benner, S. A. (1990) Extending the genetic alphabet. Enzymatic incorporation of a new base pair into DNA and RNA. Nature 343, 33-37] [Piccirilli, J. A., Krauch, T., MacPherson, L. J., Benner, S. A. (1991) A direct route to 3-(ribofuranosyl)-pyridine nucleosides. Helv. Chim. Acta 74, 397-406] [Voegel, J. J., Altorfer, M. M., Benner, S. A. (1993) The donor-acceptor-acceptor purine analog. Transformation of 5-aza-7-deaza-isoguanine to 2′-deoxy-5-aza-7-deaza-iso-guanosine using purine nucleoside phosphorylase. Helv. Chim. Acta 76, 2061-2069] [Voegel, J. J., von Krosigk, U., Benner, S. A. (1993) Synthesis and tautomeric equilibrium of 6-amino-5-benzyl-3-methylpyrazin-2-one. An acceptor-donor-donor nucleoside base analog. J. Org. Chem. 58, 7542-7547] [Heeb, N. V., Benner, S. A. (1994) Guanosine derivatives bearing an N2-3-imidazolepropionic acid. Tetrahedron Lett. 35, 3045-3048] [Voegel, J. J., Benner, S. A. (1994) Non-standard hydrogen bonding in duplex oligonucleotides. The base pair between an acceptor-donor-donor pyrimidine analog and a donor-acceptor-acceptor purine analog. J. Am. Chem. Soc. 116, 6929-6930] [von Krosigk, U., Benner, S. A. (1995) pH-independent triple helix formation by an oligonucleotide containing a pyrazine donor-donor-acceptor base. J. Am. Chem. Soc. 117, 5361-5362] [Voegel, J. J., Benner, S. A. (1996) Synthesis, molecular recognition and enzymology of oligonucleotides containing the non-standard base pair between 5-aza-7-deaza-iso-guanine and 6-amino-3-methylpyrazin-2-one, a donor-acceptor-acceptor purine analog and an acceptor-donor-donor pyrimidine analog. Helv. Chim. Acta 79, 1881-1898] [Voegel, J. J., Benner, S. A. (1996) Synthesis and characterization of non-standard nucleosides and nucleotides bearing the acceptor-donor-donor pyrimidine analog 6-amino-3-methylpyrazin-2-one. Helv. Chim. Acta 79, 1863-1880] [Kodra, J., Benner, S. A. (1997) Synthesis of an N-alkyl derivative of 2′-deoxyisoguanosine. Syn. Lett., 939-940] [Jurczyk, S., Kodra, J. T., Rozzell, J. D., Jr., Benner, S. A., Battersby, T. R. (1998) Synthesis of oligonucleotides containing 2′-deoxyisoguanosine and 2′-deoxy-5-methyliso-cytidine using phosphoramidite chemistry. Helv. Chim. Acta 81, 793-811] [Lutz, S., Burgstaller, P., Benner, S. A. (1999) An in vitro screening technique for polymerases that can incorporate modified nucleotides. Pseudouridine as a substrate for thermostable polymerases. Nucl. Acids Res. 27, 2792-2798] [Jurczyk, S. C., Battersby, T. R., Kodra, J. T., Park, J.-H., Benner, S. A. (1999) Synthesis of 2′-deoxyisoguanosine triphosphate and 2′-deoxy-5-methyl-isocytidine triphosphate. Helv. Chim. Acta 82, 1005-1015] [Jurczyk, S. C., Horlacher, J., Devine, K. G., Benner, S. A., Battersby, T. R. (2000) Synthesis and characterization of oligonucleotides containing 2′-deoxyxanthosine using phosphoramidite chemistry. Helv. Chim. Acta 83, 1517-1524] [Rao, P., Benner, S. A. (2001) A fluorescent charge-neutral analog of xanthosine: Synthesis of a 2′-deoxyribonucleoside bearing a 5-aza-7-deazaxanthine base. J. Org. Chem. 66, 5012-5015].
To systematize the nomenclature for the hydrogen bonding patterns, the hydrogen bonding pattern implemented on the small component of a nucleobase pair is designated by the prefix “py”. Following this prefix is the order, from the major groove to the minor groove, of hydrogen bond acceptor (A) and donor (D) groups. Thus, both thymine and uracil implement the standard hydrogen bonding pattern pyADA. The standard nucleobase cytosine implements the standard hydrogen bonding pattern pyDAA. Hydrogen bonding patterns implemented on the large component of the nucleobase pair are designated by the prefix “pu”. Again following the prefix, the hydrogen bond donor and acceptor groups are designated, from the major to the minor grooves, using “A” and “D”. Thus, the standard nucleobases adenine and guanine implement the standard hydrogen bonding patterns puDA- and puADD, respectively.
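The nomenclature lends itself to a simple mechanical check of complementarity, sketched below in Python for illustration only. The names `complementary_pattern` and `can_pair` are hypothetical. The sketch assumes that a small (py) component pairs only with a large (pu) component, that each donor must face an acceptor, and that the “-” in puDA- marks a position carrying no hydrogen bonding group, which is treated here as compatible with any partner group.

```python
# Hydrogen bonding patterns of the standard nucleobases, as named in the text.
STANDARD_PATTERNS = {
    "T": "pyADA",
    "U": "pyADA",
    "C": "pyDAA",
    "A": "puDA-",  # "-" : no hydrogen bonding group at this position
    "G": "puADD",
}

_FLIP_SIZE = {"py": "pu", "pu": "py"}
_FLIP_GROUP = {"A": "D", "D": "A"}


def complementary_pattern(pattern: str) -> str:
    """Return the fully complementary pattern: other size, donors <-> acceptors."""
    prefix, groups = pattern[:2], pattern[2:]
    if prefix not in _FLIP_SIZE or len(groups) != 3 or set(groups) - set("AD"):
        raise ValueError(f"not a three-position py/pu pattern: {pattern!r}")
    return _FLIP_SIZE[prefix] + "".join(_FLIP_GROUP[g] for g in groups)


def can_pair(p: str, q: str) -> bool:
    """True if one py and one pu pattern face each other donor to acceptor.

    Assumption of this sketch: a "-" position is compatible with anything.
    """
    if {p[:2], q[:2]} != {"py", "pu"}:
        return False
    return all(
        "-" in (a, b) or {a, b} == {"A", "D"}
        for a, b in zip(p[2:], q[2:])
    )


if __name__ == "__main__":
    print(complementary_pattern("pyDAA"))  # prints puADD
    print(can_pair(STANDARD_PATTERNS["T"], STANDARD_PATTERNS["A"]))  # prints True
    print(can_pair(STANDARD_PATTERNS["T"], STANDARD_PATTERNS["G"]))  # prints False
```

Under these assumptions, the pattern complementary to that of cytosine (pyDAA) is the pattern of guanine (puADD), and the thymine-guanine combination is rejected because two acceptors face one another at the first position.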
A central teaching of this disclosure is that a hydrogen bonding pattern designated using this systematic nomenclature is distinct, in concept, from the organic molecule that is used to implement the hydrogen bonding pattern. Thus, guanosine is a nucleoside that implements the puADD hydrogen bonding pattern. So, however, do 7-deazaguanosine, 3-deazaguanosine, 3,7-dideazaguanosine, and any number of other purines and purine derivatives, including those that carry side chains to which are appended functional groups, such as fluorescent, fluorescent quencher, attachment, or metal complexing groups. Which organic molecule is chosen to implement a specific hydrogen bonding pattern determines, in large part, the utility of the non-standard hydrogen bonding pattern in the various applications to which it might be applied.
The structures disclosed by U.S. Pat. No. 6,140,496, as well as its predecessor patents, provide for an expanded molecular recognition system by providing more than four independently recognizable building blocks that can be incorporated into DNA and RNA.
Should the additional nucleobase pairs be placed into DNA and RNA, and if once so placed they have the desirable pairing properties, chemical stability, and other features known to those skilled in the art, they could be useful for a variety of purposes. For example, RNA molecules prepared by transcription, although they are known to be catalysts under special circumstances [Cech, T. R.; Bass, B. L (1986). Ann. Rev. Biochem. 55, 599][Szostak, J. W. (1986) Nature 332, 83. Been, M. D.; Cech, T. R. (1988) Science 239, 1412], appear to have a much smaller catalytic potential than proteins because they lack building blocks bearing functional groups. Conversely, the limited functionality present on natural oligonucleotides constrains the chemist attempting to design catalytically active RNA molecules, in particular, RNA molecules that catalyze the template-directed polymerization of RNA.
Likewise, additional nucleobase pairs can be incorporated enzymatically at specific positions in an oligonucleotide molecule [Switzer, C. Y., Moroney, S. E., Benner, S. A. (1989) J. Am. Chem. Soc. 111, 8322]. If functionalized, such additional nucleobases should also allow the incorporation of functional groups into specific positions in a DNA or RNA sequence. A polymerase chain reaction has been demonstrated using a variant of an HIV reverse transcriptase to incorporate the pair between 2,4-diamino-5-(1′-beta-D-2′-deoxyribofuranosyl)-pyrimidine, implementing the pyDAD hydrogen bonding pattern, and 3,9-dihydro-9-(1′-beta-D-2′-deoxyribofuranosyl)-1H-purine-2,6-dione, implementing the puADA hydrogen bonding pattern [Sismour, A. M., Lutz, S., Park, J.-H., Lutz, M. J., Boyer, P. L., Hughes, S. H., Benner, S. A. (2004) PCR amplification of DNA containing non-standard base pairs by variants of reverse transcriptase from human immunodeficiency virus-1. Nucl. Acids Res. 32, 728-735]. As standard nucleobases bearing functional groups at the 5-position of the uridine ring are accepted as substrates for most polymerases [Leary, J. L., Brigati, D. J., Ward, D. C. (1983) Proc. Natl. Acad. Sci. 80, 4045], non-standard nucleobases that are modified at the analogous positions are also accepted, provided that the polymerase accepts the parent non-standard nucleobase. New nucleobase pairs should also find use in studies of the structure of biologically important RNA and DNA molecules [Chen, T. R., Churchill, M. E. A., Tullius, T. D., Kallenbach, N. R., Seemann, N. C. (1988) Biochem. 27, 6032] and protein-nucleic acid interactions. They should also be useful in assembling nanostructures, including branched DNA useful for diagnostics, or for nanomachines. Further, non-standard nucleobases can be used to expand the genetic code, increasing the number of amino acids that can be incorporated translationally into proteins [Bain, J. D., Chamberlin, A. R., Switzer, C. Y., Benner, S. A. (1992) Ribosome-mediated incorporation of non-standard amino acids into a peptide through expansion of the genetic code. Nature 356, 537-539].
Some commercial applications have already been realized with the expanded genetic information systems disclosed by Benner in his patents. For example, the nucleobase pair between 2-amino-5-methyl-1-(1′-beta-D-2′-deoxyribofuranosyl)-4(1H)-pyrimidinone, also known as 2′-deoxyisocytidine, disoC, or sometimes (less correctly) isoC, and implementing the pyAAD hydrogen bonding pattern, and 6-amino-1,9-dihydro-9-(1′-beta-D-2′-deoxyribofuranosyl)-3H-purin-2-one, also known as 2′-deoxyisoguanosine, disoG, or sometimes (less correctly) isoG, and implementing the puDDA hydrogen bonding pattern, is incorporated into the branched DNA diagnostics tools marketed today by Bayer. Here, it provides molecular recognition on demand in aqueous solution, similar to nucleic acids but with a coding system that is orthogonal to the system in DNA and RNA. Thus, it prevents the assembly of the branched dendrimer in the assay from being inhibited by adventitious nucleic acid, and prevents adventitious nucleic acid from capturing signaling elements from the nanostructure in the absence of the target analyte nucleic acid, which would create noise. Further, adding extra letters to the genetic alphabet speeds hybridization, presumably because it decreases the number of close mismatches where DNA dwells before finding its correct, fully matched partner. The branched DNA assay now has FDA approval, and is widely used to provide personalized patient care in the clinic.
The Benner patents claimed a wide range of structures generally, but only a few specifically. The compounds specifically claimed, where those claims were supported by specific examples in the disclosure, were disclosed as the preferred implementations of the individual hydrogen bonding patterns, and are reproduced in FIG. 1 (taken from FIG. 2 of U.S. Pat. No. 6,140,496). Making reference to U.S. Pat. No. 6,140,496, the following implementations (where a systematic name is given for the 2′-deoxyribonucleoside; the corresponding ribonucleosides, 2′-O-methyl ribonucleosides, and various derivatives of these were also disclosed) were preferred as implementations for each of the hydrogen bonding patterns:
For the pyDAD hydrogen bonding pattern. The preferred embodiment disclosed in U.S. Pat. No. 6,140,496 supported the pyDAD hydrogen bonding pattern on the 2,4-diaminopyrimidine heterocycle. The specific deoxyribonucleoside was 2,4-diamino-5-(1′-beta-D-2′-deoxyribofuranosyl)-pyrimidine, also named (1R)-1,4-anhydro-2-deoxy-1-C-(2,4-diamino-5-pyrimidinyl)-D-erythropentitol.
For the puADA hydrogen bonding pattern. The preferred embodiment disclosed in U.S. Pat. No. 6,140,496 supported the puADA hydrogen bonding pattern on the xanthine heterocycle. The specific deoxyribonucleoside was 3,9-dihydro-9-(1′-beta-D-2′-deoxyribofuranosyl)-1H-purine-2,6-dione, also known as 9-(2′-deoxy-beta-D-ribosyl)-xanthine.
For the pyAAD hydrogen bonding pattern. The preferred embodiment disclosed in U.S. Pat. No. 6,140,496 supported the pyAAD hydrogen bonding pattern on the 5-methyl-isocytosine heterocycle. The specific deoxyribonucleoside was 2-amino-5-methyl-1-(1′-beta-D-2′-deoxyribofuranosyl)-4(1H)-pyrimidinone.
For the puDDA hydrogen bonding pattern. The preferred embodiment disclosed in U.S. Pat. No. 6,140,496 supported the puDDA hydrogen bonding pattern on the isoguanine heterocycle. The specific deoxyribonucleoside was 6-amino-1,9-dihydro-9-(1′-beta-D-2′-deoxyribofuranosyl)-3H-purin-2-one.
For the pyDDA hydrogen bonding pattern. The preferred embodiment disclosed in U.S. Pat. No. 6,140,496 supported the pyDDA hydrogen bonding pattern on the 6-amino-5-methyl-2(1H)-pyrazinone heterocycle. The specific deoxyribonucleoside was 6-amino-5-methyl-3-(1′-beta-D-2′-deoxyribofuranosyl)-2(1H)-pyrazinone.
For the puAAD hydrogen bonding pattern. The preferred embodiment disclosed in U.S. Pat. No. 6,140,496 supported the puAAD hydrogen bonding pattern on the 5-aza-3,7-dideazaguanosine heterocycle. The specific deoxyribonucleoside was 2-amino-1,9-dihydro-5-aza-3,7-dideaza-9-(1′-beta-D-2′-deoxyribofuranosyl)-1H-purin-6-one, also known as 7-amino-9-(1′-beta-D-2′-deoxyribofuranosyl)-imidazo[1,2-c]pyrimidin-5(1H)-one.
For the pyADD hydrogen bonding pattern. The preferred embodiment disclosed in U.S. Pat. No. 6,140,496 supported the pyADD hydrogen bonding pattern on the 6-amino-3-methyl-2(1H)-pyrazinone heterocycle. The specific deoxyribonucleoside was 6-amino-3-methyl-5-(1′-beta-D-2′-deoxyribofuranosyl)-2(1H)-pyrazinone.
For the puDAA hydrogen bonding pattern. The preferred embodiment disclosed in U.S. Pat. No. 6,140,496 supported the puDAA hydrogen bonding pattern on the 4-amino-1,3,5-triazin-2(8H)-one heterocycle. The specific deoxyribonucleoside was 4-amino-8-(2-deoxy-beta-D-erythro-pentofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-2(8H)-one, also known as 4-amino-8-(1′-beta-D-2′-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-2(8H)-one.
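For illustration only, the eight preferred implementations listed above can be collected in a lookup table and grouped into their four mutually complementary pairs. The Python sketch below keys each heterocycle by the hydrogen bonding pattern named at the start of its item; the names `PREFERRED_HETEROCYCLE`, `partner`, and `pair_up` are hypothetical and do not appear in U.S. Pat. No. 6,140,496.

```python
# Heterocycles listed above as preferred implementations, keyed by the
# hydrogen bonding pattern named at the start of each item.
PREFERRED_HETEROCYCLE = {
    "pyDAD": "2,4-diaminopyrimidine",
    "puADA": "xanthine",
    "pyAAD": "5-methyl-isocytosine",
    "puDDA": "isoguanine",
    "pyDDA": "6-amino-5-methyl-2(1H)-pyrazinone",
    "puAAD": "5-aza-3,7-dideazaguanosine",
    "pyADD": "6-amino-3-methyl-2(1H)-pyrazinone",
    "puDAA": "4-amino-1,3,5-triazin-2(8H)-one",
}

# Translation table that exchanges acceptor (A) and donor (D) letters.
_SWAP = str.maketrans("AD", "DA")


def partner(pattern: str) -> str:
    """Return the complementary pattern: other ring size, A and D exchanged."""
    size = "pu" if pattern.startswith("py") else "py"
    return size + pattern[2:].translate(_SWAP)


def pair_up(table: dict) -> list:
    """Group the table into (py, pu) pairs, raising if a partner is missing."""
    pairs = []
    for pattern in sorted(p for p in table if p.startswith("py")):
        mate = partner(pattern)
        if mate not in table:
            raise KeyError(f"no implementation listed for {mate}")
        pairs.append((pattern, mate))
    return pairs


if __name__ == "__main__":
    for py, pu in pair_up(PREFERRED_HETEROCYCLE):
        print(f"{py} ({PREFERRED_HETEROCYCLE[py]}) pairs with "
              f"{pu} ({PREFERRED_HETEROCYCLE[pu]})")
```

The grouping yields pyAAD with puDDA, pyADD with puDAA, pyDAD with puADA, and pyDDA with puAAD, consistent with the four non-standard pairs described earlier.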
Despite the value of the compositions disclosed by U.S. Pat. No. 6,140,496, it is clear that the specific compositions used to implement the various non-standard hydrogen bonding patterns were not optimal, at least from the perspective of potential utility. Several problematic physical and chemical properties of the compositions that were claimed specifically were disclosed in the specification of U.S. Pat. No. 6,140,496.
For example, the nucleobases that were, in U.S. Pat. No. 6,140,496, specifically disclosed as implementations of the pyADD and pyDDA hydrogen bonding patterns undergo an epimerization reaction that interconverts the beta and alpha anomers [von Krosigk, U., Benner, S. A. (1995) pH-independent triple helix formation by an oligonucleotide containing a pyrazine donor-donor-acceptor base. J. Am. Chem. Soc. 117, 5361-5362] [Voegel, J. J., von Krosigk, U., Benner, S. A. (1993) Synthesis and tautomeric equilibrium of 6-amino-5-benzyl-3-methylpyrazin-2-one. An acceptor-donor-donor nucleoside base analog. J. Org. Chem. 58, 7542-7547]. This is illustrated in FIG. 2.
It was noted that this epimerization diminished the utility of these nucleobases. U.S. Pat. No. 6,140,496 and its predecessors proposed to solve the epimerization problem by replacing the furanose ring system (which includes an oxygen in a ring) with a carbocyclic cyclopentane derivative (which does not, and therefore cannot epimerize). The carbocyclic nucleoside analog is, however, difficult to synthesize, and has other disadvantages, and has never been incorporated into a commercial product.
An alternative tactic proposed to manage the epimerization problem has the pyrazine heterocycles that were the preferred implementations of the pyDDA and pyADD hydrogen bonding patterns (respectively) attached to a ribose derivative where a lower alkyl group, most preferably methyl, is attached to the 2′-oxygen. The 2′-O-alkyl group is large, and it was proposed that although the undesired epimerization reaction interconverting the beta and alpha anomers would still occur, steric factors would cause the beta (desired) form to predominate at equilibrium. Again, this would create problems if multiple non-standard nucleobases implementing this hydrogen bonding pattern were incorporated into an oligonucleotide analog.
The specification of U.S. Pat. No. 6,140,496 and its predecessors, as well as the literature, disclose difficulties with the use of 6-amino-1,9-dihydro-9-(1′-beta-D-2′-deoxyribofuranosyl)-3H-purin-2-one (isoguanosine, or isoG) as the implementation of the puDDA hydrogen bonding pattern. In its major keto form, isoguanosine implements the desired puDDA hydrogen bonding pattern. Isoguanosine has long been known to exist, to about 10% of the total in water, in a minor enolic tautomeric form. The enolic tautomer presents the puDAD hydrogen bonding pattern that is complementary to the thymidine and uridine nucleobases. That is, about 10% of isoguanine presents the puDAD hydrogen bonding pattern, not the desired puDDA pattern. This was noted in this specification to inconvenience efforts to use polymerases to copy DNA molecules containing isoguanine-containing nucleotide units. Indeed, some polymerases prefer to place thymidine (T) and/or uridine (U), rather than isocytidine (isoC), opposite isoguanosine in a template. The disutility of this was recently shown by Johnson et al. [Johnson, S. C., Sherrill, C. B., Marshall, D. J., Moser, M. J., Prudent, J. R. (2004) A third base pair for the polymerase chain reaction: inserting isoC and isoG. Nucl. Acids Res. 32, 1937-1941], who attempted to do a polymerase chain reaction amplification of a DNA molecule, requiring the repeated copying of the isoguanine-isocytosine nucleobase pair implementing the puDDA-pyAAD hydrogen bonding patterns. As expected from the known tautomeric behavior of isoguanine, the isoG-isoC pair was lost during the PCR reaction, presumably due to mismatching between T and the minor tautomer of isoguanosine.
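The loss of a nucleobase pair under repeated copying compounds geometrically, which can be sketched as follows. This Python fragment is illustrative only: it assumes that each copying event retains the non-standard pair with a fixed, independent probability, and the fidelity values shown are hypothetical placeholders, not figures reported by Johnson et al. or derived from the tautomer ratio. The function name `fraction_retained` is likewise an assumption.

```python
def fraction_retained(fidelity: float, copies: int) -> float:
    """Fraction of lineages still carrying the pair after `copies` copying events.

    Assumes every copying event independently retains the pair with
    probability `fidelity`, so retention falls off as fidelity ** copies.
    """
    if not 0.0 <= fidelity <= 1.0:
        raise ValueError("fidelity must lie between 0 and 1")
    if copies < 0:
        raise ValueError("copies must be non-negative")
    return fidelity ** copies


if __name__ == "__main__":
    # Hypothetical per-copy fidelities, chosen only to show the trend.
    for fidelity in (0.90, 0.96, 0.99):
        kept = fraction_retained(fidelity, 20)
        print(f"per-copy fidelity {fidelity:.2f}: {kept:.3f} retained after 20 copies")
```

Under this simple model, even a modest per-copy loss leaves only a minority of molecules carrying the pair after a few tens of copying events, which is consistent in kind with the loss of the isoG-isoC pair described above.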
Other features of the compounds that were specifically disclosed in U.S. Pat. No. 6,140,496 and its predecessors as the preferred implementations of the various hydrogen bonding schemes narrow the scope of their utility. For example, the heterocycle of 3,9-dihydro-9-(1′-beta-D-2′-deoxyribofuranosyl)-1H-purine-2,6-dione (xanthine), proposed to implement the puADA hydrogen bonding pattern, is an acid, having a pKa between 5 and 6. Thus, at neutral pH and higher, where many polymerases operate and where many applications of oligonucleotide analog recognition are desired, xanthine is deprotonated. Deprotonation creates a negative charge, which destabilizes the duplex structure [Geyer, C. R., Battersby, T. R., Benner, S. A. (2003) Nucleobase pairing in expanded Watson-Crick like genetic information systems. The nucleobases. Structure 11, 1485-1498]. It is considered unlikely that multiple xanthines in an oligonucleotide analog would support rule-based molecular recognition effectively.
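The extent of deprotonation implied by a pKa between 5 and 6 can be estimated with the Henderson-Hasselbalch relation, as in the following Python sketch. The sketch is illustrative only: it assumes a single ionizable site behaving ideally in dilute solution, and the function name `fraction_deprotonated` is hypothetical. The effective pKa of the heterocycle within a duplex may differ.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of a monoprotic acid present as its conjugate base at a given pH.

    From the Henderson-Hasselbalch relation, [A-]/[HA] = 10 ** (pH - pKa),
    so the deprotonated fraction is 1 / (1 + 10 ** (pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # pKa range stated in the text, evaluated at neutral pH.
    for pka in (5.0, 6.0):
        print(f"pKa {pka:.1f}, pH 7.0: {fraction_deprotonated(pka, 7.0):.1%} deprotonated")
```

Under these assumptions, roughly 91% to 99% of the heterocycle carries a negative charge at pH 7.0.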
Likewise, the 2,4-diamino-5-(1′-beta-D-2′-deoxyribofuranosyl)-pyrimidine proposed to implement the pyDAD hydrogen bonding pattern carries a positive charge at pH 7.0, as it is a relatively good base. Further, the synthesis of the 2′-deoxyribonucleoside bearing this nucleobase is long and expensive.
Likewise, the specifically disclosed nucleoside analogs that implement the puAAD hydrogen bonding pattern on the 5-aza-3,7-dideazaguanosine heterocycle may be poor substrates for many DNA and RNA polymerases, especially those that make contact to an unshared pair of electrons in the minor groove [Steitz, T. in Burnett, R. M. and Vogel, H. J. (eds.) Biological Organization: Macromolecular Interactions at High Resolution; Academic Press: New York, 1987, pp. 45-55.]. This limits the utility of triphosphates of nucleoside analogs bearing this heterocycle as a substrate for a DNA polymerase, an RNA polymerase, and reverse transcriptases, as well as the utility of oligonucleotide analogs carrying this heterocycle as templates for these enzymes. This is also the case for derivatives which attach an alkyl group to the N-3 of the purine or purine analogs (in the analogous positions). In U.S. Pat. No. 6,140,496 and its predecessors, various N-3 methylated purines are disclosed as implementations of various hydrogen bonding patterns.
Likewise, the implementation of the pyAAD hydrogen bonding pattern using 5-alkylisocytidine derivatives presents difficulties. Deoxyribosides bearing the 2-amino-5-methyl-1-(1′-beta-D-2′-deoxyribofuranosyl)-4(1H)-pyrimidinone heterocycle (also known as 5-methylisocytosine) are sensitive to depyrimidinylation, the cleavage of the 1-1′ nitrogen-carbon bond to separate the heterocycle from the sugar, under acidic conditions. Considerable effort was devoted to developing the delicate synthetic procedures needed to prepare oligonucleotide analogs that contain multiple 2′-deoxyisocytidines, increasing the expense of the synthesis. The acid sensitivity extends to the oligonucleotides in solution, diminishing their utility.
One purpose of the instant disclosure is to provide nucleobase analogs (where “nucleobase” refers to the heterocycle, or aglycone) that implement non-standard hydrogen bonding patterns, said analogs having properties improved over those analogs of the prior art that implement their respective hydrogen bonding patterns. In particular, the compositions of the instant invention mitigate or avoid entirely the limitations listed above of the compositions that were disclosed in U.S. Pat. No. 6,140,496.
One of ordinary skill in the art would find these improved properties unexpected, even in the light of the disclosures in patents and other literature of the prior art, and would likewise find unexpected the greater utility that these nucleobase analogs have compared to the compositions disclosed in the prior art to implement the same hydrogen bonding patterns.
Another purpose of the instant disclosure is to provide nucleoside analogs (where “nucleoside analog” is an analog of the heterocycle together with the sugar or sugar analog) that carry the nonstandard nucleobase analog, where the sugar is 2′-deoxyribose or ribose, as well as analogs where the sugar is modified, as in 2′-O-methyl, 2′-O-allyl, 2′-deoxy-2′-fluoro, and 2′,3′-dideoxynucleoside derivatives, as well as nucleoside analogs based on other sugar backbones, such as threose, locked nucleic acid derivatives, bicyclo sugars, or hexose, glycerol and glycol sugars [Zhang, L., Peritz, A., Meggers, E. (2005) A simple glycol nucleic acid. J. Am. Chem. Soc. 127, 4174-4175].
Another purpose of the instant invention is to provide oligonucleotide analogs that incorporate one or more of the nonstandard nucleoside analogs. These include nucleic acid analogs that incorporate the sugars and sugar analogs mentioned in the previous paragraph, as well as oligonucleotide analogs based on non-ionic backbones, such as “peptide nucleic acids”.
Another purpose of the instant invention is to provide nucleoside analogs in protected form that are suitable as precursors for the non-enzymatic synthesis of the non-standard oligonucleotide analogs.
Another purpose of the instant invention is to provide various phosphorylated derivatives of the stated nucleoside analogs, including triphosphates, which have utility in various enzymatic processes for the synthesis of the oligonucleotide analogs stated above.
Another purpose of the instant invention is to provide derivatives of the nucleoside analogs stated above that are degradation products of the oligonucleotide analogs stated above, and that are therefore useful (for example) in analyzing those oligonucleotide analogs.
Another purpose of the instant invention is to provide 2′,3′-dideoxy analogs of the nucleoside analogs mentioned above, 3′-ONH2 derivatives, and other analogs and derivatives useful for the purpose of sequencing the oligonucleotide analogs mentioned above.
Another purpose of the instant invention is to provide compositions of matter wherein the oligonucleotide analogs mentioned above are attached to a solid phase, including a bead or microsphere, a two dimensional surface as part of a two dimensional array, or a one dimensional array.
Another purpose of the instant invention is to provide processes for synthesizing said oligonucleotide analogs, both through template-directed polymerization and non-template-directed polymerization.
Another purpose of the instant invention is to provide processes for utilizing the compositions of matter described above. These include a variety of architectures that exploit a process that binds the stated oligonucleotide analogs to complementary oligonucleotide analogs containing one or more nucleobases that implement the complementary non-standard hydrogen bonding pattern, following an expanded set of Watson-Crick rules involving 6, 8, 10, and 12 letter DNA/RNA alphabets. These architectures include (without limitation) the stated oligonucleotide analogs as parts of compositions of matter that are beacons, nanostructures, dendrimers, and branched DNA molecules, and attached to solid supports such as beads, one dimensional arrays, two dimensional arrays, polonies, standard gels, and thermoresponsive gels, or in solution.
Another purpose of the instant invention is to provide oligonucleotide analogs as mentioned above for use in various architectures for detecting and sequencing oligonucleotides and oligonucleotide analogs, including within molecular beacons, in one and two dimensional arrays, on beads, in dendrimers that include both branched DNA and dendrimeric structures incorporating non-nucleosidic branching units, in assays involving cleavage reactions, in taggants and taggant detection schemes, and in nanostructures.
Another purpose of the instant invention is to provide the processes for utilization of the above described oligonucleotide analogs in the architectures above.
Another purpose of the instant invention is to provide functionalized derivatives of the nucleoside analogs mentioned above, carrying appendages that are fluorescent or that quench fluorescence, that assist in immobilization, that provide metal coordination sites, and that catalyze reactions, inter alia, when incorporated into the oligonucleotide analogs mentioned above, and into the processes mentioned above.
Another purpose of the instant invention is to provide processes for the repeated copying of the stated oligonucleotide analogs using template-directed polymerization, and copying of the copies in a polymerase chain reaction, having utility in oligonucleotide analog amplification, detection, and in vitro evolution to generate aptamers and oligonucleotide catalysts.
Another purpose of the instant invention is to provide non-standard nucleobases that are easily incorporated by DNA polymerases, RNA polymerases, and reverse transcriptases into the products of template-driven oligonucleotide synthesis. Various analyses of the interaction between polymerases and their substrates suggest that the polymerase seeks two unshared pairs of electrons in the minor groove, at position 3 of the purine (or analog) and at position 2 of the pyrimidine (or analog) [Steitz, T. in Burnett, R. M. and Vogel, H. J. (eds.) Biological Organization: Macromolecular Interactions at High Resolution; Academic Press: New York, 1987, pp. 45-55]. In addition, the base pairs that form three hydrogen bonds are expected to contribute more to duplex stability than pairs joined by just two hydrogen bonds.
These conditions are fulfilled for the compounds disclosed herein for implementing the pyDDA:puAAD hydrogen bonding pattern.
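By way of illustration only, the complementarity implied by the pattern notation used above (for example, pyDDA paired with puAAD) can be sketched in a few lines of Python. The sketch assumes that each pattern is written as a size prefix ("py" or "pu") followed by three letters, each "D" (hydrogen bond donor) or "A" (hydrogen bond acceptor), and that a complementary partner has the opposite size and the opposite letter at each position. The function names complement_pattern and reverse_complement are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch only; names and data layout are assumptions, not part of the disclosure.
# A pattern is written as "py" or "pu" followed by three donor (D) / acceptor (A) letters.

FLIP_GROUP = {"D": "A", "A": "D"}    # a donor pairs with an acceptor, and vice versa
FLIP_SIZE = {"py": "pu", "pu": "py"}  # a pyrimidine analog pairs with a purine analog


def complement_pattern(pattern: str) -> str:
    """Return the hydrogen bonding pattern complementary to `pattern`."""
    size, groups = pattern[:2], pattern[2:]
    if size not in FLIP_SIZE or len(groups) != 3 or any(g not in FLIP_GROUP for g in groups):
        raise ValueError(f"not a recognized pattern: {pattern!r}")
    return FLIP_SIZE[size] + "".join(FLIP_GROUP[g] for g in groups)


def reverse_complement(strand: list[str]) -> list[str]:
    """Return the anti-parallel complementary strand, expressed as patterns."""
    return [complement_pattern(p) for p in reversed(strand)]


if __name__ == "__main__":
    print(complement_pattern("pyDDA"))
    print(reverse_complement(["pyDDA", "puADD"]))
```

The strand is reversed before complementing because, as with standard Watson-Crick pairing, complementary strands are anti-parallel to one another.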