(1) Field of Invention
This invention relates to the field of nucleic acid chemistry, more specifically to nucleotide analogs, and still more specifically to “non-standard” nucleotide analogs that, when incorporated into oligonucleotides (DNA or RNA, collectively xNA), present to a complementary strand in a Watson-Crick pairing geometry a pattern of hydrogen bonds that is different from the pattern presented by adenine, guanine, cytosine, and uracil. Most specifically, this disclosure describes inventive steps that enable the preparation of functional oligonucleotides containing non-standard nucleotides that bind to target molecules (called “aptamers”) or catalyze reactions (called “xNAzymes”) by a process of “in vitro selection” (or IVS). In particular, this invention claims processes that comprise creating xNA libraries, selecting from those libraries individual xNA molecules that perform the preselected function to generate a fraction of xNA molecules having enhanced performance capabilities, PCR amplifying these with less than 5% loss of the non-standard nucleotide, and determining the sequence of certain of those performing molecules.
(2) Description of Related Art
For two decades, many have sought processes that mimic, in the laboratory, biological evolution to select or evolve DNA or RNA (collectively xNA) molecules that act as ligands, receptors, or catalysts [Ellington & Szostak, 1990][Collett, et al., 2005][Tuerk & Gold, 1990][Breaker & Joyce, 1994]. This process has been called Systematic Evolution of Ligands by Exponential Enrichment (SELEX), “in vitro selection”, or in vitro evolution (collectively referred to as IVS) [Ellington & Szostak, 1990] [Tuerk & Gold, 1990]. The xNA ligands and receptors that bind to a preselected target are called aptamers. xNA molecules that catalyze a preselected reaction are called xNAzymes.
As generally practiced, IVS generates aptamers or xNAzymes by the following steps:
(a) A library of nucleic acid (xNA) molecules (typically on the order of 10^14 different species) is obtained.
(b) The library is then fractionated to create a fraction that contains molecules better able to bind to the preselected target(s), or catalyze the preselected reaction(s), than molecules in the fractions left behind. For example, to generate aptamers, this separation can be done by contacting the library with a solid support carrying the target, washing from the support xNA molecules that do not bind, and recovering from the support xNA molecules that have bound. xNA molecules within the library that bind to the target are said to survive the selection.
(c) The surviving xNAs are then used as templates for the polymerase chain reaction (PCR) process. A low level of mutation may be included in the PCR amplification, creating Darwinian “variation” in an in vitro evolution process.
(d) While it is conceivable that aptamers/xNAzymes having useful binding/catalytic power may emerge in the first “round” of selection, they generally do not. When they do not, the cycle is repeated. With each cycle of fractionation/selection and PCR amplification, the resulting fraction of xNA molecules becomes more enriched in those that bind to the preselected target or catalyze the preselected reaction.
(e) The product xNA aptamer(s) and xNAzyme(s) might be useful even if their sequences are not known. However, the utility of these products is nearly always enhanced if their sequences are known, as this allows them to be generated separately. To obtain those sequences, standard IVS procedures generally clone the xNA products in their DNA form (either directly for DNA products, or after conversion to a DNA sequence using reverse transcriptase for RNA products), followed by classical sequencing. Alternatively, next-generation sequencing can be applied to the mixture of survivors. The elements of this approach are reviewed in many publications [Irvine et al., 1991][Szostak, 1992].
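The enrichment described in steps (b) through (d) can be illustrated with a minimal numerical sketch. The model below is an assumption made for illustration only and is not taken from the cited literature: the library is treated as just two classes (binders and non-binders), each class is retained on the support with a fixed probability, and PCR is assumed to amplify all survivors equally, so that amplification restores the pool size without changing its composition. The function name and all parameter values are hypothetical.

```python
def enrich(fraction_binders, p_retain_binder, p_retain_background, rounds):
    """Toy two-class model of repeated selection and uniform amplification.

    fraction_binders:     starting fraction of the library that binds the target
    p_retain_binder:      probability a binder survives one fractionation step
    p_retain_background:  probability a non-binder survives the same step
    Returns the binder fraction before round 1 and after each round.
    """
    history = [fraction_binders]
    f = fraction_binders
    for _ in range(rounds):
        kept_binders = f * p_retain_binder
        kept_background = (1.0 - f) * p_retain_background
        # Uniform PCR amplification rescales the pool but keeps this ratio.
        f = kept_binders / (kept_binders + kept_background)
        history.append(f)
    return history


# Hypothetical numbers: one binder per 10^9 molecules, and a 100-fold
# difference in retention between binders and background.
for round_number, f in enumerate(enrich(1e-9, 0.5, 0.005, 6)):
    print(f"round {round_number}: binder fraction = {f:.3e}")
```

Under these assumed values the binder fraction grows roughly 100-fold per round while binders are rare, which is consistent with the observation that useful molecules seldom emerge in the first round and that several cycles are generally required.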
An early example used IVS to obtain oligonucleotides as ligands for reverse transcriptase [Chen & Gold, 1994]. xNA aptamers have now been obtained for many targets, including small molecules [Famulok, 1999], carbohydrates [Sun, et al., 2010], and peptides [Gopinath, 2007]. Some aptamers discriminate between closely related proteins [Green, et al., 1996]. Aptamers have been proposed for and used in therapy [Ng, et al., 2006] [Nimjee, et al., 2005]. xNAzymes are known to catalyze a wide range of reactions, including RNA cleavage and RNA ligation.
Many advantages anticipated when xNA molecules replace protein molecules have also been realized. For example, xNA aptamers can be reversibly unfolded and refolded, permitting them to be regenerated, giving them longer useful lifetimes and better storage properties [Collett et al., 2005]. xNA aptamers can be inexpensive to prepare on large scale once their sequence has been identified, especially if they can be made using enzymatic amplification tools, such as PCR.
xNA aptamers have other advantages. For example, strategies that allow xNA-based aptamers to directly signal the presence of a bound target are easier to conceive than for antibodies. Fluorescent signaling, for instance, is possible by simply attaching donors and quenchers to specific sites in a DNA sequence [Hiep, et al., 2010]. More prospectively, but not unreasonably, aptamers might generate electrical readouts when their target is bound [Hayashi, et al., 2010]; this could reduce enormously the cost of multiplexed diagnostics.
Nevertheless, xNA aptamers have not replaced antibodies, and xNAzymes are not widely used as practical catalysts. While not wishing to be bound by theory, as IVS technology matured, it became clear that the diversity and binding power of xNA aptamers did not match those of proteins [Proske, et al., 2005] [Hamula, et al., 2006], nor did the catalytic power of xNAzymes. These limitations were exemplified and discussed [Li, et al., 2009]. The limitations of xNA molecules as catalysts have also been discussed [Carrigan, et al., 2004].
In retrospect, this disappointing outcome might be viewed as unsurprising. Proteins are built from 20 different amino acid building blocks that carry much chemical functionality, including positively charged nitrogens on lysine and arginine, general acid-base functionality on histidine, hydrophobic groups on leucine and others, polarizable binding groups (as on tryptophan and methionine), metal coordinating groups (cysteine, histidine, and others), and so on. Structural biology and mechanistic biochemistry identify roles for all of these in the binding between proteins and their ligands. In contrast, nucleic acids carry little of this functionality.
Further, with only four building blocks, nucleic acids have fewer motifs for folding than proteins. For example, a G-rich region might lead to a particular “G-quartet”, desired to form a specific binding site for a particular target. However, this quartet might be in equilibrium with an alternative folding motif based on G's elsewhere in a sequence involving G:C pairing. The alternative fold need not have any affinity for a target. There are only a limited number of interaction types that can be achieved in DNA with just four letters. Further, with low information density arising from four different building blocks, it is difficult to obtain unambiguous folds from standard xNAs. Those attempting to build nanostructures from DNA biobricks have also encountered this as an obstacle to achieving their goals [Smolke, 2009]. Further, even if the desired fold is the thermodynamic minimum, it can be kinetically slow to achieve, again because of the low information density in standard xNA.
Of course, stronger affinity is seen with targets having a natural propensity to bind to xNA molecules. For example, aptamers selected to bind HIV integrase, reverse transcriptase, and nucleocapsid proteins have affinities of 10-800, 0.3-20, and 2 nM, respectively [Burke et al., 1996] [Allen et al., 1995] [Schneider et al., 1995] [Allen et al., 1996]. Dissociation constants for targets with an overall positive charge, complementary to the negative charge of xNA molecules, can also be low, as shown by aptamers to PDGF (0.1 nM [Green et al., 1996]) and thrombin (25 nM [Bock et al., 1992]).
The limitations of standard DNA and RNA aptamers are evidenced by the number of laboratories that have attempted to surmount them [Battersby, et al., 1999][Hollenstein, et al., 2009a][Hollenstein, et al., 2009b]. Again not wishing to be bound by theory, one hypothesis holds that the limitations of aptamers and xNAzymes compared to, for example, antibodies and protein enzymes arise from the relatively little functionality in xNA compared to proteins.
Pursuing this hypothesis, the Perrin group at the University of British Columbia made DNA where each of the four standard building blocks (G, A, C, and T, or GACT) carries a different functional group [Hollenstein, et al., 2009a][Hollenstein, et al., 2009b]. They report improvement in catalytic power in this system. A decade earlier, the Benner group introduced a single functional group to an ATP aptamer [Battersby, et al., 1999]. Others have added hydrophilic and hydrophobic groups [Vaught, et al., 2010] [Zichi, et al., 2008]. SomaLogic modified uridines at the 5-position with benzyl, naphthyl, tryptamino, or isobutyl groups, generating SOMAmers (Slow Off rate Modified Aptamers) [Gold, et al., 2010] [Kraemer, et al., 2011].
However, simply functionalizing standard xNA nucleotides (as in SOMAmers) does not greatly expand the diversity of folds available to xNA. Nor does it increase the information density of the biopolymer. Further, functionalizing GACT encounters a new set of problems. For example, xNA molecules having a fluorescent group attached to each nucleobase [Brakmann & Nieckchen, 2001] [Brakmann & Lobermann, 2002] are hard to make using xNA polymerases [Ramsay, et al., 2010]. Further, in ways that are not fully understood, having each nucleobase carry a functional group can cause the DNA to cease to follow the “rule based” molecular recognition essential for its genetic roles.
One solution to this impasse involves expanding the number of letters in the DNA alphabet. For example, rearranging hydrogen bond donor and acceptor groups on the nucleobases can increase the number of independently replicable nucleosides in DNA and RNA from four to twelve (FIG. 1) [Switzer, et al., 1989][Piccirilli, et al., 1990]. In this “artificially expanded genetic information system” (AEGIS), 12 different nucleotide “letters” pair via six distinguishable hydrogen bonding patterns to give a system that can, in principle, pair, be copied, and evolve like natural DNA, but with higher information density and more functional group diversity.
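One simple way to put a number on “higher information density” is to count the bits that each position in a sequence can encode, which is the base-2 logarithm of the alphabet size. The sketch below assumes that every letter is equally likely at every position, an idealization that real libraries need not satisfy; the function names are illustrative only.

```python
import math


def bits_per_position(alphabet_size):
    """Maximum information per position, assuming equiprobable letters."""
    return math.log2(alphabet_size)


def sequence_space(alphabet_size, length):
    """Number of distinct sequences of a given length."""
    return alphabet_size ** length


for name, size in (("standard (4 letters)", 4), ("AEGIS (12 letters)", 12)):
    print(f"{name}: {bits_per_position(size):.3f} bits per position, "
          f"{sequence_space(size, 10):,} possible 10-mers")
```

On this measure a four-letter alphabet carries 2 bits per position and a twelve-letter alphabet about 3.58 bits, so a sequence of a given length drawn from twelve letters can specify far more distinct molecules than one drawn from four.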
The potential for using AEGIS to support IVS has been recognized since the proposal of the first AEGIS. Indeed, processes for doing IVS with certain AEGIS-containing nucleotides were claimed by U.S. Pat. No. 5,965,363. However, efforts to implement the process disclosed in that patent have failed. Steps (a) and (b) (above) in the IVS process were possible. Libraries of xNA molecules containing AEGIS components could be prepared (Step (a)), and these libraries could be fractionated (Step (b)). However, polymerases were not available to perform PCR on DNA molecules containing multiple AEGIS nucleotides [Sismour, et al., 2004]. Further, even after polymerases that copied AEGIS nucleotides were obtained, the AEGIS nucleotides were lost during repeated PCR cycling [Johnson, et al., 2004], with perhaps as much as 5% loss per cycle when isoguanosine was used to implement the puDDA hydrogen bonding pattern. Efforts to prevent their loss led to DNA molecules with multiple sulfur atoms [Sismour, et al., 2005], undesirable for many applications. Still other AEGIS components suffered epimerization, which prevented their being routinely copied [Huffer & Benner, 2003].
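The practical weight of a per-cycle loss can be seen from a short calculation. The sketch below assumes that the loss is the same in every cycle and that cycles are independent, so that retention compounds geometrically; this is a simplifying assumption for illustration, not a model taken from the cited work, and the function names are hypothetical.

```python
import math


def fraction_retained(loss_per_cycle, cycles):
    """Fraction of non-standard nucleotide sites still intact after
    a number of PCR cycles, assuming a constant, independent loss per cycle."""
    return (1.0 - loss_per_cycle) ** cycles


def cycles_to_half(loss_per_cycle):
    """Number of cycles after which half of the sites have been lost."""
    return math.log(0.5) / math.log(1.0 - loss_per_cycle)


for cycles in (10, 20, 30):
    print(f"{cycles} cycles at 5% loss: "
          f"{fraction_retained(0.05, cycles):.1%} retained")
```

Under this assumption, a 5% loss per cycle leaves only about 36% of the non-standard sites after 20 cycles, and half are gone after roughly 13.5 cycles, which illustrates why such a loss rate is incompatible with the repeated amplification that IVS requires.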
Further, even if components in a library of AEGIS-containing oligonucleotides could be amplified and the AEGIS components retained, no downstream tools were available to clone the AEGIS-containing xNA aptamers or xNAzymes. Bacteria were not known to accept AEGIS components. Further, no process was available to sequence AEGIS-containing xNA aptamers.
After many years of attempting to do IVS based on libraries of AEGIS-containing oligonucleotides, it is clear that any claims covering an AEGIS-based IVS in the prior art were simply not enabled. This specifically includes the process claimed by U.S. Pat. No. 5,965,363.