Nucleic acids can carry different chemical structures at their 5′ end, as is the case in, e.g., a total RNA extract. For instance, 28S and 18S ribosomal RNA (rRNA) as well as microRNA (miRNA) and 5′ degraded RNAs have a 5′ monophosphate. Other RNAs have a 5′ diphosphate, such as 5S rRNA intermediates, or a 5′ OH, such as 18S rRNA intermediates and transfer RNA (tRNA) intermediates. Prokaryotic mRNAs have a 5′ triphosphate and eukaryotic mRNAs have a cap structure (1). The cap structure is in essence a guanosine that is methylated at N7 of the purine ring and linked through its 5′ position to the 5′ position of the next nucleotide of the RNA via a triphosphate bridge.
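The 5′ end classes listed above can be summarized as a small lookup table. The following Python sketch is purely illustrative: the names FivePrimeEnd, RNA_CLASS_TO_END and classes_with_end are assumptions introduced here for clarity and are not part of any cited method.

```python
from enum import Enum


class FivePrimeEnd(Enum):
    """5' end chemistries named in the text above."""
    HYDROXYL = "5' OH"
    MONOPHOSPHATE = "5' monophosphate"
    DIPHOSPHATE = "5' diphosphate"
    TRIPHOSPHATE = "5' triphosphate"
    CAP = "cap (N7-methylguanosine, 5'-5' triphosphate bridge)"


# Illustrative mapping of the RNA classes mentioned in the text
# to the 5' end structure the text assigns to them.
RNA_CLASS_TO_END = {
    "28S rRNA": FivePrimeEnd.MONOPHOSPHATE,
    "18S rRNA": FivePrimeEnd.MONOPHOSPHATE,
    "miRNA": FivePrimeEnd.MONOPHOSPHATE,
    "5' degraded RNA": FivePrimeEnd.MONOPHOSPHATE,
    "5S rRNA intermediate": FivePrimeEnd.DIPHOSPHATE,
    "18S rRNA intermediate": FivePrimeEnd.HYDROXYL,
    "tRNA intermediate": FivePrimeEnd.HYDROXYL,
    "prokaryotic mRNA": FivePrimeEnd.TRIPHOSPHATE,
    "eukaryotic mRNA": FivePrimeEnd.CAP,
}


def classes_with_end(end):
    """Return the RNA classes (sorted by name) that carry the given 5' end."""
    return sorted(name for name, e in RNA_CLASS_TO_END.items() if e is end)
```

Calling classes_with_end(FivePrimeEnd.CAP) on this table returns only the eukaryotic mRNA entry, which is why the cap is an attractive handle for mRNA selection.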
As different chemical structures on the 5′ position of the RNA's 5′ nucleotide provide information about the RNA's function and its pathway of synthesis and degradation, methods that select for RNA with different 5′ ends are important tools for the analysis of RNA.
Furthermore, for many downstream analyses the RNA is reverse transcribed into cDNA, as many powerful methods such as PCR exist for DNA analysis. It is therefore most desirable that the identity of specific 5′ RNA ends is transferred to the cDNA while the cDNA remains as faithful a copy of the RNA as possible. This is of special importance in the full length analysis of RNA molecules, as only the full length RNA molecule reveals the entire sequence identity of an RNA.
Several methods exist that can select for different classes of RNA. For instance, eukaryotic mRNA is often selected based on the cap structure and/or a 3′ poly A tail. Enrichment processes that target the poly A tail, such as oligo dT column chromatography, are well known in the art. Oligo dT priming during first strand cDNA synthesis is another way to selectively copy mRNAs during reverse transcription. However, methods that select for the poly A tail do not select against 5′ degraded or fragmented mRNA, and they miss certain RNA classes that are not polyadenylated, such as histone mRNA.
Therefore, methods have also been developed that use the presence of the cap structure to purify or enrich for mRNA or its cDNA.
For instance, oligo capping (2-4) is one method used to add an oligonucleotide to the 5′ end of the mRNA. It is a multistep protocol that requires the RNA sample to be dephosphorylated, leaving only 5′ OH and cap structures. The cap is then cleaved off by tobacco acid pyrophosphatase (TAP), leaving a 5′ monophosphate to which an oligonucleotide can be ligated in a subsequent reaction. In essence, this oligonucleotide provides a sequence tag that can be selected for, e.g. after reverse transcription, by PCR amplification. Using the oligo capping method in conjunction with selection for full length amplification products (e.g. as described in WO 2007/062445), the present inventors have found that oligo capping introduces considerable bias towards shorter RNA molecules, because the multiple enzymatic steps involved degrade long RNA molecules. Thus, methods are desirable that conserve the RNA or its full length sequence information during amplification and cap structure selection.
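The selection logic of oligo capping described above can be sketched as a sequence of transformations of the 5′ end state. The Python below is a minimal, idealized model (every enzymatic step is assumed to go to completion and no degradation is modeled); the function names and the string labels for the end states are assumptions introduced here, not terms from the cited protocols.

```python
# Idealized model of oligo capping: each step rewrites the 5' end state.
# End states are plain strings: "OH", "monophosphate", "diphosphate",
# "triphosphate", "cap" (labels are hypothetical, chosen for this sketch).

FREE_PHOSPHATES = {"monophosphate", "diphosphate", "triphosphate"}


def dephosphorylate(end):
    """Step 1: remove free 5' phosphates; the cap is left untouched."""
    return "OH" if end in FREE_PHOSPHATES else end


def decap(end):
    """Step 2: TAP cleaves the cap and leaves a 5' monophosphate."""
    return "monophosphate" if end == "cap" else end


def is_ligatable(end):
    """Step 3: only a 5' monophosphate accepts the tag oligonucleotide."""
    return end == "monophosphate"


def oligo_capping(sample):
    """Return, per RNA, whether it ends up carrying the oligonucleotide tag."""
    tagged = {}
    for name, end in sample.items():
        end = dephosphorylate(end)
        end = decap(end)
        tagged[name] = is_ligatable(end)
    return tagged


sample = {
    "eukaryotic mRNA": "cap",
    "28S rRNA": "monophosphate",
    "prokaryotic mRNA": "triphosphate",
    "tRNA intermediate": "OH",
}
result = oligo_capping(sample)
```

In this idealized model only the originally capped RNA is tagged. The order of the steps matters: skipping the dephosphorylation would leave the pre-existing 5′ monophosphates available for ligation as well.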
The cap trapper method (5;6) is another method that selects for mRNA sequences by first biotinylating the cap and then reverse transcribing the RNA. In a further step, RNase I is used to hydrolyze all RNA that is not in a hybrid with the cDNA. The mRNA/cDNA hybrids are then bound to magnetic avidin beads, effectively selecting for full length cDNA copies that are then further processed.
A similar method, CAPture (7) and EP 373 914 A2, uses a cap binding protein (eukaryotic initiation factor 4) coupled to a solid support to select for mRNA/cDNA hybrids in conjunction with an RNase I digestion step.
Efimov et al. (8) and U.S. Pat. No. 6,022,715 provide for a method of chemically ligating an oligonucleotide to the cap of an RNA in a multistep procedure.
Other methods exploit the property of the reverse transcriptase to add 1-6 cytosines (Cs) to the cDNA strand when it reaches the cap, and select for these Cs. For instance, the CapSelect method (9) ligates an adapter to the cDNA, depending on the presence of 3-4 Cs, to enrich for such tailed cDNA during amplification.
In U.S. Pat. Nos. 5,962,271 and 5,962,272 a method is disclosed that adds a defined sequence to the 3′ end of the cDNA based on the template switching ability of the reverse transcriptase, which can be combined with the cap dependent C addition to enrich for full length cDNA. For both template switching and tailing (CapSelect), the addition of Cs was presumed to be favored if a cap is present (9). However, others have found that addition also takes place if no cap is present (10), making these methods not very selective. In addition, when the reverse transcriptase stalls at secondary or tertiary RNA structures during reverse transcription, a template switch can likewise occur, yielding a spurious tag.
In US 2010/159526 A1 methods are disclosed that add a tagging oligonucleotide to the 5′ triphosphates of prokaryotic mRNA by first incubating the RNA with a polyphosphatase to reduce the 5′ triphosphate to a 5′ monophosphate, which can then serve as the donor for ligation to the 3′ OH of a tagging oligonucleotide in a subsequent reaction. Again, the selection for the prokaryotic mRNA is carried out before cDNA synthesis.
WO 2007/117039 A1 and US 2008/0108804 A1 describe a method wherein mRNA is selected by removing the cap and ligating a nucleic acid molecule to the residual phosphate left by cap removal. To distinguish mRNA from other RNA, non-capped RNA molecules are dephosphorylated beforehand to prevent ligation to them. The modified RNA is then reverse transcribed into cDNA.
U.S. Pat. Nos. 6,174,669 and 6,022,715 relate to a further cap modification method wherein the diol structure in the 5′ cap is oxidized to form a reactive dialdehyde that is then labeled with a tag. However, oxidation compromises RNA integrity, and the RNA may be degraded.
DE 199 20 611 A1 and US 2003/0049637 A1 describe modifying a cDNA by ligating an adapter for complementary strand synthesis. The adapter is attached by first synthesizing several Cs using high Mg and Mn ion concentrations during reverse transcription (RT) and then ligating a short oligonucleotide with a terminal transferase.
In DE 101 05 208 A1 it is disclosed that RT efficiency can be increased by including betaine in the RT reaction during cDNA synthesis.
In summary, the methods used to date are either not very selective or are multistep protocols. Because RNA is particularly unstable, multistep protocols increase the chance that the RNA is fragmented. The use of multiple enzymatic reactions, which very often require divalent cations such as Mg2+ and elevated temperatures, in itself leads to RNA degradation. In addition, enzyme preparations are never truly pure, and even minor amounts of nucleases increase the chance of RNA degradation. Finally, each additional reaction step also increases the chance that an RNase contamination is introduced that would degrade the RNA.
As longer RNA molecules are at greater risk of degradation than shorter RNA molecules, a bias towards shorter RNA molecules is also introduced. Therefore, methods are needed that increase selectivity and sensitivity in tagging specific 5′ ends of a transcript while at the same time offering the possibility to test for the full length sequence information of the RNA.
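The length bias described above can be illustrated with a deliberately simple model. The Python sketch below assumes, hypothetically, that every phosphodiester bond is cleaved independently with the same small probability in each reaction step; the function name, the parameter names and the default cleavage probability are illustrative assumptions, not measured values.

```python
def intact_fraction(length_nt, n_steps, p_cleave=1e-5):
    """Expected fraction of RNA molecules that remain full length.

    Toy model (an assumption for illustration only): each of the
    (length_nt - 1) phosphodiester bonds is cleaved independently with
    probability p_cleave in each of n_steps reaction steps.
    """
    if length_nt < 0 or n_steps < 0:
        raise ValueError("length_nt and n_steps must be non-negative")
    if not 0.0 <= p_cleave <= 1.0:
        raise ValueError("p_cleave must lie between 0 and 1")
    bonds = max(length_nt - 1, 0)
    # A molecule stays intact only if no bond is cut in any step.
    return (1.0 - p_cleave) ** (bonds * n_steps)


# Compare a short and a long transcript in a one-step and a five-step protocol.
short_one_step = intact_fraction(500, 1)
long_one_step = intact_fraction(5000, 1)
short_five_steps = intact_fraction(500, 5)
long_five_steps = intact_fraction(5000, 5)
```

Under these assumptions the intact fraction falls exponentially with the product of transcript length and number of steps, so adding steps penalizes long transcripts more than short ones, which is the bias the text describes.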