Nucleic acids are amplified through the self-complementarity of A-T (U) and G-C base pairs, and function as catalysts and ligands. However, the functions of DNA and RNA molecules are limited because natural nucleic acids are formed of nucleotides having only four different bases, as compared with the twenty different amino acids in natural proteins. Unnatural base pair systems offer a solution to this problem because they can be used to increase the number of base types in nucleic acids and thereby expand genetic information (non-patent documents 1-5). Unnatural base pairs are required to have highly specific complementarity allowing for site-specific incorporation of specific nucleotide analogs into DNA and RNA via a polymerase-catalyzed reaction. If this became possible, current genetic engineering, which is limited by the number of naturally occurring bases, could be replaced by a novel technique using an unnatural base pair system.
The first attempt to generate an unnatural base pair was made by Benner et al. (non-patent documents 6-7). They developed several unnatural base pairs having hydrogen-bonding patterns different from those of natural base pairs, such as isoguanine-isocytosine (isoG-isoC) and xanthosine-diaminopyrimidine. Recently, these unnatural base pairs were applied to PCR amplification of DNA fragments containing such base pairs (non-patent documents 8-9), and to sequence analysis (non-patent document 10). However, the fidelity was only moderate and/or a complex procedure was required.
Subsequently, Kool et al. synthesized hydrophobic bases having shapes similar to those of natural bases but lacking the ability to form hydrogen bonds in base pairing (non-patent documents 11-12). These hydrophobic bases were selectively recognized by DNA polymerases, which suggests that geometric complementarity between base pairs, rather than hydrogen-bonding interaction, is important in replication. Recently, a series of hydrophobic base pairs were developed by Romesberg et al., and these base pairs were complementarily incorporated into DNA by the Klenow fragment of DNA polymerase I derived from E. coli (non-patent documents 13-15). However, these hydrophobic bases were also incorporated non-specifically opposite one another in replication, without following geometric complementarity (non-patent document 14).
By combining the concepts of hydrogen-bonding pattern and geometric complementarity, the present inventors were able to develop unnatural base pairs between 2-amino-6-(2-thienyl)purine (s) and 2-oxopyridine (y) (patent document 1, non-patent documents 16-17), and between 2-amino-6-(2-thiazolyl)purine (v) and y (patent document 2, non-patent document 18). Bulky substituents at position 6 of s and v efficiently inhibited undesirable base pairing with natural bases (non-cognate pairing), and substrates of y and modified y bases (nucleoside 5′-triphosphates) were site-specifically incorporated into RNA by T7 RNA polymerase complementarily to s or v in the template. This specific transcription is practical as a means for developing functional RNA molecules (non-patent documents 19-21), but the selectivity of the s-y and v-y base pairs in replication is not much higher than the selectivity in transcription (non-patent documents 16, 18).
Unnatural base pairs approaching a commercial level in replication have been reported, such as a P-Z base pair (P: 2-amino-imidazo[1,2-a]-1,3,5-triazine-4(8H)-one and Z: 6-amino-5-nitro-2(1H)-pyridone) of S. A. Benner et al. in the U.S. (non-patent document 22); an isoG-isoC base pair of EraGen in the U.S. (patent document 3, and non-patent document 9); and a Ds-Pa base pair and a Ds-Pn base pair (wherein Ds means a 7-(2-thienyl)-3H-imidazo[4,5-b]pyridine-3-yl group, Pa means a 2-formyl-1H-pyrrole-1-yl group, and Pn means a 2-nitro-1H-pyrrole-1-yl group) of Hirao et al., who are also inventors of the present invention (patent document 4, and non-patent documents 23 and 24). However, the unnatural base pairs of Benner et al. and EraGen suffer from low selectivity in replication, a limited number of usable PCR cycles, and difficulty in detecting minor amounts of DNA. The unnatural base pairs of Hirao et al. have high selectivity, but require the use of special substrates for their replication, and their PCR amplification efficiency is not particularly high.
The conservation rate of unnatural bases in DNA during one cycle of PCR amplification is 97.5% for the P-Z base pair of Benner et al., ~96% for the isoG-isoC base pair of EraGen, and ~99% for the Ds-Pa and Ds-Pn base pairs previously developed by the present inventors. If the conservation rate of unnatural base pairs in PCR is 97.5% per cycle, only about 60% (0.975^20 ≈ 0.60) of the unnatural base pairs remain in the DNA finally amplified after 20 cycles of PCR. Thus, the P-Z and isoG-isoC base pairs are not easily applied to the various techniques that are based on nucleic acid replication/amplification reactions in which only minor (small) amounts of DNA are employed. Moreover, no method for sequencing DNA containing these unnatural base pairs has been reported to be in use on a commercial scale.
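The compounding estimate above can be reproduced with a short Python sketch. It assumes, as the text does, that the same fraction of unnatural base pairs is conserved independently in every cycle; the function name `retained_fraction` and the dictionary of rates are illustrative only, and the ~96% and ~99% figures are the approximate values quoted above.

```python
def retained_fraction(per_cycle_rate: float, cycles: int) -> float:
    """Fraction of unnatural base pairs still present after `cycles` PCR cycles,
    assuming a constant, independent per-cycle conservation rate."""
    if not 0.0 <= per_cycle_rate <= 1.0:
        raise ValueError("per_cycle_rate must lie between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return per_cycle_rate ** cycles


# Per-cycle conservation rates as quoted in the text (two are approximate).
rates = {
    "P-Z": 0.975,
    "isoG-isoC": 0.96,
    "Ds-Pa / Ds-Pn": 0.99,
}

for name, rate in rates.items():
    # Print the estimated retention after 20 cycles for each base pair.
    print(f"{name}: {retained_fraction(rate, 20):.2f} retained after 20 cycles")
```

Under this simple model, a difference of a few percentage points per cycle grows into a large difference in the final product, which is why the per-cycle rate matters so much when only small amounts of starting DNA are available.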
The Ds-Pa and Ds-Pn base pairs previously developed by the present inventors are estimated to remain at a level of 82% (0.99^20 ≈ 0.82) in the DNA amplified by 20 cycles of PCR. However, there is a need to develop base pairs having still higher conservation rates than the Ds-Pa and Ds-Pn base pairs in order to apply them to various techniques based on nucleic acid replication/amplification reactions. Moreover, PCR amplification of DNA containing these unnatural base pairs requires the use of special modified substrates (γ-amidotriphosphate derivatives), which complicates the operation. In addition, although the locations of unnatural base pairs in DNA can be confirmed by sequencing, the sequencing results may be perturbed depending on the sequence of natural base pairs in the proximity of the unnatural base pairs, and the method needs to be improved so as to be more generally applicable.
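To illustrate why still higher conservation rates are sought, the same compounding model can be inverted to ask what per-cycle rate would be needed to reach a given overall retention. This is a minimal sketch under the same constant-rate assumption; the function name `required_rate` and the 95% target are hypothetical and are not taken from the text.

```python
def required_rate(target_retention: float, cycles: int) -> float:
    """Per-cycle conservation rate needed so that `target_retention` of the
    unnatural base pairs survive `cycles` PCR cycles (constant-rate model)."""
    if not 0.0 < target_retention <= 1.0:
        raise ValueError("target_retention must be in (0, 1]")
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    # Invert retention = rate ** cycles.
    return target_retention ** (1.0 / cycles)


# Sanity check against the text: 82% after 20 cycles corresponds to ~99% per cycle.
print(f"{required_rate(0.82, 20):.4f}")

# Hypothetical target: retain 95% of unnatural base pairs after 20 cycles.
print(f"{required_rate(0.95, 20):.4f}")
```

The first call recovers the roughly 99% per-cycle figure quoted for the Ds-Pa and Ds-Pn base pairs; the second shows that a noticeably higher overall retention would require a per-cycle rate well above 99%.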