In nucleic acids (DNA, RNA), which are biological macromolecules, enormous amounts of genetic information essential for vital activities are recorded as sequences composed of combinations of only 4 different bases. A nucleic acid can self-replicate, using itself as a template, by the action of DNA polymerases, and further undergoes RNA polymerase-mediated transcription and ribosome-mediated translation, ensuring the transmission of genetic information from DNA to DNA, from DNA to RNA, and/or from RNA to protein. These replication and transmission events of genetic information are enabled by exclusive base-pairings (A:T/U, G:C). In addition, nucleic acids can form a variety of higher-order structures and hence exert various functions. One indication of this is that a large number of novel nucleic acids having aptamer and/or ribozyme functions have been generated by in vitro selection techniques.
However, compared with proteins, which are composed of 20 types of amino acids, nucleic acids have limited chemical and physical diversity because natural nucleic acids contain only 4 different bases (2 base pairs). Indeed, functional RNAs (e.g., tRNA, rRNA, mRNA) found in living organisms utilize various modified bases to stabilize their own structures and/or RNA-RNA and RNA-protein interactions. Thus, it will be very advantageous to expand the repertoire of bases (base pairs) in developing novel functional nucleic acids.
With the aim of further expanding nucleic acid functions, attempts have been made to design nucleosides or nucleotides having unnatural bases. There are two possible approaches for introducing modified bases (or unnatural bases) into nucleic acids: 1) direct introduction by chemical synthesis; and 2) introduction catalyzed by DNA and RNA polymerase enzymes. In the case of 1), there is a need to solve some problems associated with chemical synthesis, such as the stability of amidite units and the availability of protecting groups appropriate for base moieties. If these problems are solved, various unnatural bases can be introduced in a site-selective manner. However, the nucleic acids thus obtained are difficult to amplify, and long-chain nucleic acids are also difficult to synthesize. In the case of 2), if the enzymes recognize substrates to cause replication and transcription between artificial base pairs in a complementary manner, nucleic acids containing such artificial base pairs can be amplified and prepared. However, such substrates and base pairs (unnatural nucleotides) are still under development.
Research Background of Unnatural Artificial Base Pairs
In natural double-stranded DNA, the “exclusive” A-T and G-C base pairs are formed through specific hydrogen bonding. Unnatural base pairs studied to date include combinations based on hydrogen bonding between bases and/or combinations based on the hydrophobicity of bases, but no unnatural base pair has been found that can compete with natural base pairs in all steps of replication, transcription and translation. Under these circumstances, an unnatural base pair capable of competing with natural base pairs in at least one step of replication, transcription and translation will have a specific utility.
Recent studies of unnatural non-hydrogen-bonded base pairs have shown that hydrogen bonding between paired bases is not absolutely required for efficient and adequate incorporation of nucleotides during replication; rather, shape complementarity between bases plays an important role during replication (Non-patent Documents 5-6). Thus, non-hydrogen-bonded base pairs are potential candidates for expansion of the genetic alphabet and code to create novel biotechnologies (Non-patent Documents 7-9).
On the other hand, no report has been issued on a non-hydrogen-bonded base pair that is recognized by RNA polymerase with high selectivity and efficiency during transcription. As for hydrogen-bonded base pairs, the inventors of the present invention have developed an (s-y) base pair based on hydrogen bonding between a 2-amino-6-(2-thienyl)-9H-purin-9-yl group (s) and a 2-oxo-(1H)pyridin-3-yl group (y), and have succeeded in in vitro synthesis of proteins site-specifically containing an unnatural amino acid(s) (Patent Document 1, Non-patent Document 14).
Moreover, Patent Document 3 discloses a (v-y) base pair between a 2-amino-6-(2-thiazolyl)purin-9-yl group (v) and a 2-oxo-(1H)pyridin-3-yl group (y).
Unnatural base pairs involving s other than (s-y), namely (s-z) (wherein z represents a 2-oxo-1,3-dihydroimidazol-1-yl group) and (s-5-substituted y), have also been reported prior to the present invention. These unnatural (s-y), (s-z) and (s-5-substituted y) base pairs all involve hydrogen bonding interaction (Patent Document 2, Non-patent Documents 13-15).
However, among these unnatural base pairs, (s-y) and (s-5-substituted y) are less selective in a transcription step where s is inserted into RNA using y as a DNA template, while (s-z) is highly selective but does not necessarily provide a high yield.

On the other hand, a 2-formyl-1H-pyrrol-1-yl group (Pa) has been reported to be able to form non-hydrogen-bonded base pairs not only with the four natural bases, but also with 9-methylimidazo[(4,5)-b]pyridine (Q) (Pa-Q) (Non-patent Document 9).

However, the above non-hydrogen-bonded base pairs have been reported only for replication by DNA polymerase or DNA synthesis by reverse transcriptase. Nothing is known about their recognition by RNA polymerase. Single-subunit T7-like RNA polymerases resemble DNA polymerases both in structure and mechanism (Non-patent Documents 16-17), but recent structural analyses of T7 RNA polymerase complexes have clarified differences between RNA polymerase and DNA polymerase (Non-patent Documents 17-18). In the case of transcription, an incoming substrate forms a base pair with a base in its template in the “open” conformation, and the formed base pair is maintained during transition from the “open” to the “closed” conformation (Non-patent Document 18). In the case of replication, however, base pairing starts after transition to the “closed” conformation. This suggests that hydrogen bonding between paired bases is more important in transcription than in replication, which in turn raises the question of whether non-hydrogen-bonded base pairs could be functional during transcription.
If unnatural base-mediated specific transcription is achieved, it will be possible to design novel RNA molecules having enhanced functions and to expand the genetic code (Non-patent Documents 1-4).
Introduction of Fluorescent Probes into RNA
In response to recent growing interest in RNA therapy, introduction of fluorescent probes into RNA has become an important technology for fluorescently labeling RNA and for analyzing the complex higher-order structure of RNA. Fluorescent probes previously used for the latter structural analysis include those using a nucleotide having a fluorescent base (e.g., 2-aminopurine). In this regard, the 2-amino-6-(2-thienyl)-9H-purin-9-yl group (s) mentioned above is a fluorescent base, and a nucleotide containing this group as a base has strong fluorescence properties. Thus, if a nucleotide containing s as a base can be introduced at any site in RNA, it will be possible to prepare a useful fluorescent probe.
However, known techniques used for introducing such a nucleotide having a fluorescent base into RNA through transcription are not necessarily sufficient in terms of yield. It has therefore been substantially impossible to introduce a nucleotide having a fluorescent base (e.g., s) into RNA in a site-specific manner through transcription, thus making it difficult to analyze the higher-order structure of long-chain RNA.
Once a technique has been developed for introducing a fluorescent probe into RNA through transcription, it will be possible to study the structural dynamics of RNA in solution. Moreover, such a technique would also be helpful in developing RNA-based drugs. For these reasons, it is important to establish a technique that allows introduction of a fluorescent probe into long-chain RNA, which has thus far been regarded as impossible. This would enable the development of new techniques for use in in vivo structural analysis of functional RNA and in structural analysis of RNA complexes with other molecules.
There is a demand for the development of a novel artificial base pair for use in introducing, into a nucleic acid, a nucleotide having an unnatural base which can compete with natural base pairs in all steps of replication, transcription and translation and which imparts fluorescence properties to a nucleic acid.
Patent Document 1: WO2001/005801
Patent Document 2: WO2004/007713
Patent Document 3: WO2005/026187
Non-patent Document 1: Benner, S. A., Burgstaller, P., Battersby, T. R. & Jurczyk, S. in The RNA World (eds Gesteland, R. F., Cech, T. R. & Atkins, J. F.) 163-181 (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1999).
Non-patent Document 2: Bergstrom, D. E. Orthogonal base pairs continue to evolve. Chem. Biol. 11, 18-20 (2004).
Non-patent Document 3: Wang, L. & Schultz, P. G. Expanding the genetic code. Chem. Commun. 1-11 (2002).
Non-patent Document 4: Hendrickson, T. L., de Crecy-Lagard, V. & Schimmel, P. Incorporation of nonnatural amino acids into proteins. Annu. Rev. Biochem. 73, 147-176 (2004).
Non-patent Document 5: Morales, J. C. & Kool, E. T. Efficient replication between non-hydrogen-bonded nucleoside shape analogs. Nat. Struct. Biol. 5, 950-954 (1998).
Non-patent Document 6: Kool, E. T. Hydrogen bonding, base stacking, and steric effects in DNA replication. Annu. Rev. Biophys. Biomol. Struct. 30, 1-22 (2001).
Non-patent Document 7: McMinn, D. L. et al. Efforts toward expansion of the genetic alphabet: DNA polymerase recognition of a highly stable, self-pairing hydrophobic base. J. Am. Chem. Soc. 121, 11585-11586 (1999).
Non-patent Document 8: Wu, Y. et al. Efforts toward expansion of the genetic alphabet: optimization of interbase hydrophobic interactions. J. Am. Chem. Soc. 122, 7621-7632 (2000).
Non-patent Document 9: Mitsui, T. et al. An unnatural hydrophobic base pair with shape complementarity between pyrrole-2-carbaldehyde and 9-methylimidazo[(4,5)-b]pyridine. J. Am. Chem. Soc. 125, 5298-5307 (2003).
Non-patent Document 10: Piccirilli, J. A., Krauch, T., Moroney, S. E. & Benner, S. A. Enzymatic incorporation of a new base pair into DNA and RNA extends the genetic alphabet. Nature 343, 33-37 (1990).
Non-patent Document 11: Switzer, C. Y., Moroney, S. E. & Benner, S. A. Enzymatic recognition of the base pair between isocytidine and isoguanosine. Biochemistry 32, 10489-10496 (1993).
Non-patent Document 12: Tor, Y. & Dervan, P. B. Site-specific enzymatic incorporation of an unnatural base, N6-(6-aminohexyl)isoguanosine, into RNA. J. Am. Chem. Soc. 115, 4461-4467 (1993).
Non-patent Document 13: Ohtsuki, T. et al. Unnatural base pairs for specific transcription. Proc. Natl. Acad. Sci. U.S.A. 98, 4922-4925 (2001).
Non-patent Document 14: Hirao, I. et al. An unnatural base pair for incorporating amino acid analogs into proteins. Nat. Biotechnol. 20, 177-182 (2002).
Non-patent Document 15: Hirao, I. et al. A two-unnatural-base-pair system toward the expansion of the genetic code. J. Am. Chem. Soc. 126, 13298-13305 (2004).
Non-patent Document 16: Cheetham, G. M., Jeruzalmi, D. & Steitz, T. A. Structural basis for initiation of transcription from an RNA polymerase-promoter complex. Nature 399, 80-83 (1999).
Non-patent Document 17: Yin, Y. W. & Steitz, T. A. The structural mechanism of translocation and helicase activity in T7 RNA polymerase. Cell 116, 393-404 (2004).
Non-patent Document 18: Temiakov, D. et al. Structural basis for substrate selection by T7 RNA polymerase. Cell 116, 381-391 (2004).
Non-patent Document 19: Hirao, I., Fujiwara, T., Kimoto, M. & Yokoyama, S. Unnatural base pairs between 2- and 6-substituted purines and 2-oxo(1H)pyridine for expansion of the genetic alphabet. Bioorg. Med. Chem. Lett. 14, 4887-4890 (2004).
Non-patent Document 20: Jovine, L., Djordjevic, S. & Rhodes, D. The crystal structure of yeast phenylalanine tRNA at 2.0 Å resolution: cleavage by Mg2+ in 15-year old crystals. J. Mol. Biol. 301, 401-414 (2000).
Non-patent Document 21: Law, S. M., Eritja, R., Goodman, M. F. & Breslauer, K. J. Spectroscopic and calorimetric characterizations of DNA duplexes containing 2-aminopurine. Biochemistry 35, 12329-12337 (1996).
Non-patent Document 22: Holz, B., Klimasauskas, S., Serva, S. & Weinhold, E. 2-Aminopurine as a fluorescent probe for DNA base flipping by methyltransferases. Nucleic Acids Res. 26, 1076-1083 (1998).
Non-patent Document 23: Rist, M. J. & Marino, J. P. Association of an RNA kissing complex analyzed using 2-aminopurine fluorescence. Nucleic Acids Res. 29, 2401-2408 (2001).
Non-patent Document 24: Bain, J. D., Switzer, C., Chamberlin, A. R. & Benner, S. A. Ribosome-mediated incorporation of a non-standard amino acid into a peptide through expansion of the genetic code. Nature 356, 537-539 (1992).
Non-patent Document 25: Ludwig, J. & Eckstein, F. Rapid and efficient synthesis of nucleoside 5′-O-(1-thiotriphosphates), 5′-triphosphates and 2′,3′-cyclophosphorothioates using 2-chloro-4H-1,3,2-benzodioxaphosphorin-4-one. J. Org. Chem. 54, 631-635 (1989).
Non-patent Document 26: Frugier, M., Helm, M., Felden, B., Giege, R. & Florentz, C. Sequences outside recognition sets are not neutral for tRNA aminoacylation. J. Biol. Chem. 273, 11605-11610 (1998).
Non-patent Document 27: Kao, C., Rudisser, S. & Zheng, M. A simple and efficient method to transcribe RNAs with reduced 3′ heterogeneity. Methods 23, 201-205 (2001).
Non-patent Document 28: Mitsui, T., Kimoto, M., Sato, A., Yokoyama, S. & Hirao, I. Bioorg. Med. Chem. Lett. 13, 4515-4518 (2003).
Non-patent Document 29: Fujiwara, T., Kimoto, M., Sugiyama, H., Hirao, I. & Yokoyama, S. Bioorg. Med. Chem. Lett. 11, 2221-2223 (2001).
Non-patent Document 30: Mitsui, T., Kimoto, M., Harada, Y., Yokoyama, S. & Hirao, I. J. Am. Chem. Soc. 127, 8652-8658 (2005).