Some genes have their coding sequences interrupted by stretches of non-coding DNA. These non-coding sequences are termed introns. To produce a mature transcript from these genes, the primary RNA transcript (precursor RNA) must undergo a cleavage-ligation reaction termed RNA splicing. RNA splicing produces the mature transcript of the polypeptide-coding messenger RNA (mRNA), ribosomal RNA, or transfer RNA (tRNA). Introns are grouped into four categories (Groups I, II, III, and IV) based on their structure and the type of splicing reaction they undergo.
Of particular interest to the present invention are the Group I introns. Group I introns undergo an intra-molecular RNA splicing reaction leading to cyclization that does not require protein cofactors. Cech, Science, 236:1532-1539 (1987).
The Group I introns, including the intron isolated from the large ribosomal RNA precursor of Tetrahymena thermophila, have been shown to catalyze a sequence-specific phosphoester transfer reaction involving RNA substrates. Zaug and Cech, Science, 229:1060-1064 (1985); and Kay and Inoue, Nature, 327:343-346 (1987). This sequence-specific phosphoester transfer reaction leads to the removal of the Group I intron from the precursor RNA and ligation of the two exons in a process known as RNA splicing. The splicing reaction catalyzed by Group I introns proceeds via a two-step transesterification mechanism. The details of this reaction have been reviewed recently by Cech, Science, 236:1532-1539 (1987).
The splicing reaction of Group I introns is initiated by the binding of guanosine or a guanosine nucleotide to a site within the Group I intron structure. Attack at the 5' splice site by the 3'-hydroxyl group of guanosine results in the covalent linkage of guanosine to the 5' end of the intervening intron sequence. This reaction generates a new 3'-hydroxyl group on the uridine at the 3' terminus of the 5' exon. The 5' exon subsequently attacks the 3' splice site, yielding spliced exons and the full-length linear form of the Group I intron.
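The two transesterification steps described above can be followed in a minimal string-based sketch. It is only a bookkeeping illustration, not a chemical model: the function name `splice_group_i`, its parameters, and the example sequence are hypothetical and do not come from any cited reference.

```python
def splice_group_i(precursor: str, five_ss: int, three_ss: int, cofactor: str = "G"):
    """Toy bookkeeping model of Group I splicing (illustrative only).

    precursor -- RNA written 5'->3' as: 5' exon + intron + 3' exon
    five_ss   -- index of the first intron nucleotide (5' splice site)
    three_ss  -- index of the first 3'-exon nucleotide (3' splice site)
    cofactor  -- the exogenous guanosine that initiates the reaction
    Returns (spliced_exons, linear_intron).
    """
    if not 0 < five_ss < three_ss < len(precursor):
        raise ValueError("need 0 < five_ss < three_ss < len(precursor)")

    # Step 1: the guanosine attacks the 5' splice site. The 5' exon is
    # released with a free 3'-hydroxyl, and the guanosine becomes
    # covalently joined to the 5' end of the intron.
    five_exon = precursor[:five_ss]
    g_intron_three_exon = cofactor + precursor[five_ss:]

    # Step 2: the 3'-hydroxyl of the 5' exon attacks the 3' splice site,
    # ligating the exons and releasing the full-length linear intron.
    intron_end = len(cofactor) + (three_ss - five_ss)
    linear_intron = g_intron_three_exon[:intron_end]
    three_exon = g_intron_three_exon[intron_end:]

    return five_exon + three_exon, linear_intron


# Hypothetical sequence: 5' exon "CUCUCU", intron "AAAUAGG", 3' exon "UAAGGU".
spliced, intron = splice_group_i("CUCUCUAAAUAGGUAAGGU", five_ss=6, three_ss=13)
```

In this sketch the released intron carries the added guanosine at its 5' end, which matches the covalent linkage described in the text; the subsequent cyclization reaction is not modelled.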
The linear Group I intron usually cyclizes following splicing. Cyclization occurs via a third transesterification reaction, involving attack of the 3'-terminal guanosine at an internal site near the 5' end of the intron. The Group I introns also undergo a sequence-specific hydrolysis reaction at the splice site sequences, as described by Inoue et al., J. Mol. Biol., 189:143-165 (1986). This activity has been used to cleave RNA substrates in a sequence-specific manner by Zaug et al., Nature, 324:429-433 (1986).
The structure of Group I introns has been recently reviewed by J. Burke, Gene, 73:273-294 (1988). The structure is characterized by nine base paired regions, termed P1-P9 as described in Burke et al., Nucleic Acids Res., 15:7217-7221 (1987). The folded structure of the intron is clearly important for the catalytic activity of the Group I introns as evidenced by the loss of catalytic activity under conditions where the intron is denatured. In addition, mutations that disrupt essential base-paired regions of the Group I introns result in a loss of catalytic activity. Burke, Gene, 73:273-294 (1988). Compensatory mutations or second-site mutations that restore base-pairing in these regions also restore catalytic activity. Williamson et al., J. Biol. Chem., 262:14672-14682 (1987); and Burke, Gene, 73:273-294 (1988).
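The logic of disruptive and compensatory mutations can be illustrated with a small sketch that checks whether two strands of a paired region still form a continuous helix. It assumes strict Watson-Crick pairing only (no G-U wobble pairs), and the helper `is_paired` and the four-nucleotide strands are hypothetical examples, not sequences from any Group I intron.

```python
# Assumption: only canonical Watson-Crick pairs count as paired.
WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def is_paired(strand: str, partner: str) -> bool:
    """Return True if two strands, both written 5'->3', form an
    uninterrupted antiparallel Watson-Crick helix."""
    if len(strand) != len(partner):
        return False
    # Antiparallel pairing: the first base of one strand pairs with the
    # last base of the other, so the partner is read in reverse.
    return all((a, b) in WC_PAIRS for a, b in zip(strand, reversed(partner)))


wild_type = is_paired("GGAC", "GUCC")      # intact helix
single_mutant = is_paired("GGUC", "GUCC")  # A->U on one strand breaks a pair
compensated = is_paired("GGUC", "GACC")    # second-site U->A restores pairing
```

The single mutation leaves a U opposite a U, so the helix is disrupted; the second-site change on the partner strand restores a U-A pair, which mirrors how compensatory mutations restore base-pairing in the experiments cited above.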
Several different deletions that remove a large nucleotide segment from the Group I introns (FIG. 2) without destroying their ability to cleave RNA have been reported. Burke, Gene, 73:273-294 (1988). However, attempts to combine large deletions have resulted in both active and inactive introns. Joyce et al., Nucleic Acids Res., 17:7879 (1989).
To date, Group I introns have been shown to cleave substrates containing either RNA or both RNA and DNA. Zaug et al., Science, 231:470-475 (1986); Sugimoto et al., Nucleic Acids Res., 17:355-371 (1989); and Cech, Science, 236:1532-1539 (1987). A DNA containing 5 deoxycytosines was shown not to be a cleavage substrate for the Tetrahymena IVS, a Group I intron, by Zaug et al., Science, 231:470-475 (1986).