Recent studies have shown that almost all parts of the human genome, including even so-called “non-coding regions”, are transcribed into RNA (e.g., see Genome Research Volume 17, Issue 6: June 2007). As a result, there is currently great interest in identifying, characterizing and determining the biological fate and functions of all transcribed RNAs, including mRNAs, non-coding RNAs, such as microRNAs (miRNAs) or their pri-miRNA or pre-miRNA precursors, and other RNA molecules, including those which have not been identified.
There is also continuing interest in identifying and analyzing the expression of various RNA molecules in order to understand differentiation, biological responses to environment, and other biological processes in normal and abnormal cells in eukaryotes. For example, there is great interest in studying disease-related RNA molecules in eukaryotic cells in order to understand the initiation and progression of each disease and, hopefully, to find treatments or ways to prevent the disease or the disease progression.
With respect to diseases of eukaryotes caused by pathogenic bacteria, mycoplasma, and viruses, there is great interest in identifying, characterizing, and determining the biological functions of RNAs encoded by genomes of both the host and the pathogen during the course of infection, disease initiation, and disease progression.
The nature of the 5′ ends of different classes of RNA molecules plays an important role in their biological structure and function. The chemical moieties on the 5′ ends of RNA molecules influence their structure, stability, biochemical processing, transport, biological function and fate in a cell or organism. The chemical moieties commonly found at the 5′ ends of different RNA classes include triphosphates, monophosphates, hydroxyls, and cap nucleotides. The particular chemical moiety on the 5′ end provides important clues to the origin, processing, maturation and stability of the RNA. Characterization of this moiety in a newly identified RNA could even suggest a role for the RNA in the cell. Therefore, methods that can discriminate between classes of RNA molecules that contain different 5′ end groups are important tools for characterizing, studying, and manipulating RNA.
For example, bacterial mRNAs typically have a triphosphate group on their 5′ ends. Still further, many eukaryotic RNAs that are not translated into protein, referred to as “non-coding RNAs” or “ncRNAs,” have been described, and many of these ncRNAs have a 5′ triphosphate group. In addition, small prokaryotic and eukaryotic ribosomal RNAs (e.g., 5S or 5.8S rRNAs), and transfer RNAs (tRNAs) typically have a 5′ triphosphate group.
Most eukaryotic cellular mRNAs and most eukaryotic viral mRNA transcripts are “capped” at their 5′ terminus. A “cap” or “cap nucleotide” consists of a guanine nucleoside that is joined via its 5′-carbon to a triphosphate group that is, in turn, joined to the 5′-carbon of the most 5′-nucleotide of the primary mRNA transcript, and in most eukaryotes, the nitrogen at the 7 position of guanine in the cap nucleotide is methylated. Thus, most eukaryotic cellular mRNAs and most eukaryotic viral mRNAs have an “N7-methylguanosine” or “m7G” cap or cap nucleotide on their 5′ ends.
In addition to eukaryotic cellular and viral mRNAs, some ncRNAs are also capped, and some capped ncRNAs also have a 3′ poly(A) tail, like most eukaryotic mRNAs. For example, Rinn, J L et al. (Cell 129: 1311-1323, 2007) described one capped and polyadenylated 2.2-kilobase ncRNA encoded in the HOXC region of human chromosome 12, termed “HOTAIR,” that has profound effects on expression of HOXD genes on chromosome 2. In addition, some other eukaryotic RNAs in a sample, such as small nuclear RNAs (“snRNAs”), and pre-miRNAs, can be capped.
The 5′ caps of eukaryotic cellular and viral mRNAs (and some other forms of RNA) play important roles in mRNA metabolism, and are required to varying degrees for processing and maturation of an mRNA transcript in the nucleus, transport of mRNA from the nucleus to the cytoplasm, mRNA stability, and efficient translation of the mRNA to protein. For example, the cap plays a pivotal role in the initiation of protein synthesis and in eukaryotic mRNA processing and stability in vivo. The cap provides resistance to 5′ exoribonuclease (XRN) activity and its absence results in rapid degradation of the mRNA (e.g., see Mol. Biol. Med. 5: 1-14, 1988; Cell 32: 681-694, 1983). Thus, mRNA prepared (e.g., in vitro) for introduction (e.g., via microinjection into oocytes or transfection into cells) and expression in eukaryotic cells should be capped.
Many eukaryotic viral RNAs are infectious only when capped, and when RNA molecules that are not capped (i.e., they are “uncapped”) are introduced into cells via transfection or microinjection, they are rapidly degraded by cellular RNases (e.g., see Krieg and Melton, Nucleic Acids Res. 12: 7057, 1984; Drummond et al., Nucleic Acids Res. 13: 7375, 1985).
The primary transcripts of many eukaryotic cellular genes and eukaryotic viral genes require processing to remove intervening sequences (introns) within the coding regions of these transcripts, and the benefits of the cap also extend to stabilization of such pre-mRNA. For example, it was shown that the presence of a cap on pre-mRNA enhanced in vivo splicing of pre-mRNA in yeast, but was not required for splicing, either in vivo or using in vitro yeast splicing systems (Fresco, L D and Buratowski, S, RNA 2: 584-596, 1996; Schwer, B et al., Nucleic Acids Res. 26: 2050-2057, 1998; Schwer, B and Shuman, S, RNA 2: 574-583, 1996). The enhancement of splicing was primarily due to the increased stability of the pre-mRNA since, in the absence of a cap, the pre-mRNA was rapidly degraded by 5′ exoribonuclease (Schwer, B et al., Nucleic Acids Res. 26: 2050-2057, 1998). Thus, it is also beneficial that transcripts synthesized for in vitro RNA splicing experiments are capped.
While capped mRNA remains in the cytoplasm after being exported from the nucleus, some other RNAs, such as some snRNAs, have caps that are further methylated, and these RNAs are then imported back into the nucleus, where they are involved in splicing of introns from pre-mRNA to generate mRNA exons (Mattaj, Cell 46: 905-911, 1986; Hamm et al., Cell 62: 569-577, 1990; Fischer, et al., J. Cell Biol. 113: 705-714, 1991).
The splicing reaction generates spliced intron RNA that initially comprises RNA that has a 5′ monophosphate group. Thus, at least some initially-generated intron RNA molecules from pre-mRNA splicing reactions also have a 5′ phosphate group. In addition, some other RNAs, such as eukaryotic or viral-encoded microRNAs (miRNAs), and both eukaryotic and prokaryotic large ribosomal RNA molecules (rRNA), including 18S and 26S or 28S eukaryotic rRNAs, or 16S and 23S prokaryotic rRNAs, have a monophosphate group on their 5′ ends.
RNase A-degraded RNAs and some other endonucleolytically processed RNA molecules have a 5′ hydroxyl group.
Enzymes that modify the 5′ ends of RNA are useful tools for characterizing and manipulating various RNA molecules in vitro. For example, alkaline phosphatase (AP) (e.g., APEX™ alkaline phosphatase (EPICENTRE), shrimp alkaline phosphatase (USB, Cleveland, Ohio), or Arctic alkaline phosphatase (New England Biolabs, MA)) converts the 5′ triphosphates of uncapped primary RNA and the 5′ monophosphates of rRNA to 5′ hydroxyl groups, generating RNAs that have a 5′ hydroxyl group, but does not affect capped RNA. Nucleic acid pyrophosphatase (PPase) (e.g., tobacco acid pyrophosphatase (TAP)) cleaves the triphosphate groups of both capped and uncapped RNAs to generate RNAs that have a 5′ monophosphate group. A decapping enzyme (e.g., yeast decapping enzyme, mammalian decapping enzyme, Arabidopsis thaliana decapping enzyme, or vaccinia virus decapping enzymes D9 or D10) converts capped RNA (e.g., m7G-capped RNA) to RNA that has a 5′ monophosphate group. A capping enzyme (e.g., SCRIPTCAP™ capping enzyme, EPICENTRE; poxvirus capping enzyme; vaccinia virus capping enzyme; or Saccharomyces cerevisiae capping enzyme RNA triphosphatase) converts RNA that has a 5′ triphosphate group or RNA that has a 5′ diphosphate group to capped RNA. Polynucleotide kinase (PNK; e.g., T4 PNK) monophosphorylates hydroxyl groups on the 5′ ends of RNA molecules and removes monophosphate groups on the 3′ ends of RNA molecules (e.g., 3′ monophosphates generated from the action of RNase A). Further, 5′ exoribonuclease (XRN; e.g., Saccharomyces cerevisiae Xrn I exoribonuclease) digests 5′-monophosphorylated RNA to mononucleotides, but generally does not digest RNA that has a 5′ triphosphate, 5′ cap, or 5′ hydroxyl group.
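The enzyme specificities listed above can be summarized as a lookup table. The following Python sketch is illustrative only: the names `End`, `ACTIVITY`, `treat`, and `xrn_digests` are hypothetical, and the model assumes that any 5′ end group not named in the text as a substrate of an enzyme is left unchanged by that enzyme, which simplifies real enzyme behavior.

```python
from enum import Enum


class End(Enum):
    """5' end groups named in the text (labels are illustrative)."""
    TRIPHOSPHATE = "ppp"
    DIPHOSPHATE = "pp"
    MONOPHOSPHATE = "p"
    HYDROXYL = "OH"
    CAP = "m7G cap"


# Each enzyme maps the 5' end groups it acts on to the product it leaves.
# Assumption: end groups that are not listed are unaffected by the enzyme.
ACTIVITY = {
    "AP": {End.TRIPHOSPHATE: End.HYDROXYL, End.MONOPHOSPHATE: End.HYDROXYL},
    "TAP": {End.TRIPHOSPHATE: End.MONOPHOSPHATE, End.CAP: End.MONOPHOSPHATE},
    "decapping": {End.CAP: End.MONOPHOSPHATE},
    "capping": {End.TRIPHOSPHATE: End.CAP, End.DIPHOSPHATE: End.CAP},
    "PNK": {End.HYDROXYL: End.MONOPHOSPHATE},
}


def treat(end, enzyme):
    """Return the 5' end group after treatment with the named enzyme."""
    return ACTIVITY[enzyme].get(end, end)


def xrn_digests(end):
    """5' exoribonuclease digests only 5'-monophosphorylated RNA."""
    return end is End.MONOPHOSPHATE


if __name__ == "__main__":
    for enzyme in ACTIVITY:
        for end in End:
            print(f"{enzyme:10s} {end.value:8s} -> {treat(end, enzyme).value}")
```

In this model, for example, `treat(End.CAP, "AP")` returns the cap unchanged, whereas `treat(End.CAP, "TAP")` returns a 5′ monophosphate that `xrn_digests` would report as a substrate for 5′ exoribonuclease.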
The reaction specificity of RNA ligase can also be a useful tool to discriminate between RNA molecules that have different 5′ end groups. This enzyme catalyzes phosphodiester bond formation specifically between a 5′ monophosphate in a donor RNA and a 3′-hydroxyl group in an acceptor oligonucleotide (e.g., an RNA acceptor oligonucleotide). Thus, RNAs that have a monophosphate group on their 5′ ends, whether present in a sample or obtained by treatment of 5′-triphosphorylated or 5′-capped RNA with TAP, are donor substrates for ligation to an acceptor nucleic acid that has a 3′ hydroxyl group using RNA ligase. RNA molecules that contain triphosphate, diphosphate, hydroxyl or capped 5′ end groups do not function as donor molecules for RNA ligase (e.g., T4 RNA ligase, EPICENTRE, or bacteriophage TS2126 RNA ligase). Thus, RNAs that have a hydroxyl group on their 5′ ends, whether present in a sample or obtained by treatment with AP, cannot serve as donor substrates for RNA ligase (e.g., T4 RNA ligase, EPICENTRE, or bacteriophage TS2126 RNA ligase). Similarly, RNA molecules that contain a 3′-terminal blocked group (e.g., RNA molecules that have a 3′-phosphate group or a 3′-beta-methoxyphenylphosphate group) do not function as acceptor substrates for RNA ligase.
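The donor and acceptor requirements described above reduce to a simple rule: ligation requires a 5′ monophosphate on the donor and a free 3′ hydroxyl on the acceptor. The following minimal Python sketch encodes that rule; the function name `can_ligate` and the string labels for the end groups are hypothetical and are chosen only for illustration.

```python
# Labels for end groups are illustrative, not a standard nomenclature.
DONOR_5P_ENDS = {"triphosphate", "diphosphate", "monophosphate", "hydroxyl", "cap"}
ACCEPTOR_3P_ENDS = {"hydroxyl", "phosphate", "blocked"}


def can_ligate(donor_5p_end, acceptor_3p_end):
    """Return True if RNA ligase can join the donor 5' end to the acceptor 3' end.

    Per the text, only a 5' monophosphate donor and a 3' hydroxyl acceptor
    are substrates; every other combination is rejected.
    """
    if donor_5p_end not in DONOR_5P_ENDS:
        raise ValueError(f"unknown donor 5' end: {donor_5p_end}")
    if acceptor_3p_end not in ACCEPTOR_3P_ENDS:
        raise ValueError(f"unknown acceptor 3' end: {acceptor_3p_end}")
    return donor_5p_end == "monophosphate" and acceptor_3p_end == "hydroxyl"


if __name__ == "__main__":
    for donor in sorted(DONOR_5P_ENDS):
        print(f"{donor:14s} donor + 3'-OH acceptor: {can_ligate(donor, 'hydroxyl')}")
```

Only the monophosphate donor paired with a 3′-hydroxyl acceptor returns `True`; a 3′-phosphate or otherwise blocked acceptor returns `False` for every donor.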
Numerous publications disclose use of alkaline phosphatase (AP), tobacco acid pyrophosphatase (TAP), and RNA ligase to manipulate m7G-capped eukaryotic mRNAs using so-called “oligo capping methods.” For example, oligo capping methods and their use are disclosed in: World Patent Applications WO0104286 and WO 2007/117039 A1; U.S. Pat. No. 5,597,713; Suzuki, Y et al., Gene 200: 149-156, 1997; Suzuki, Y and Sugano, S, Methods in Molecular Biology, 175: 143-153, 2001, ed. by Starkey, M P and Elaswarapu, R, Humana Press, Totowa, N.J.; Fromont-Racine, M et al., Nucleic Acids Res. 21: 1683-4, 1993; and in Maruyama, K and Sugano, S, Gene 138: 171-174, 1994.
In those oligo capping methods, total eukaryotic RNA or isolated polyadenylated RNA is first treated with AP and then the AP is inactivated or removed. The AP converts RNA that has a 5′ triphosphate (e.g., uncapped primary RNA) and RNA that has a 5′ monophosphate to RNA that has a 5′ hydroxyl. The sample is then treated with TAP, which converts the 5′-capped eukaryotic mRNA to mRNA that has a 5′ monophosphate. The resulting 5′-monophosphorylated mRNA is then “oligo-capped” (or “5′ ligation tagged”) with an acceptor oligonucleotide using RNA ligase. The “oligo-capped” mRNA that has a “tag” joined to its 5′ end in turn serves as a template for synthesis of first-strand cDNA that has a tag joined to its 3′ end. Then, double-stranded cDNA can be made using a second-strand cDNA synthesis primer that is complementary to the tag joined to the 3′ end of the first-strand cDNA, and the resulting double-stranded cDNA can be used (e.g., to generate a full-length cDNA library). Oligo capping methods in the art are useful for 5′ ligation tagging of m7G-capped RNA, for making full-length first-strand cDNA using the 5′-ligation-tagged RNA as a template, for making full-length double-stranded cDNA (including full-length cDNA libraries), and for identification of the 5′ ends of eukaryotic mRNA (e.g., by sequencing or by methods such as rapid amplification of cDNA ends (5′ RACE)).
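The enzymatic steps of the oligo capping workflow described above (AP, then TAP, then RNA ligase) can be traced with a short Python sketch. It is a simplified model, not a protocol: the names `AP`, `TAP`, and `oligo_capping`, the end-group labels, and the example sample are hypothetical, and each enzyme is assumed to act to completion and to leave unlisted end groups unchanged.

```python
# 5' end-group labels used in this sketch: "cap", "ppp", "p", "OH".
AP = {"ppp": "OH", "p": "OH"}    # alkaline phosphatase; capped RNA unaffected
TAP = {"cap": "p", "ppp": "p"}   # tobacco acid pyrophosphatase


def oligo_capping(sample):
    """Return the names of RNAs that end up 5' ligation tagged.

    `sample` maps an RNA name to its initial 5' end group.
    """
    tagged = []
    for name, end in sample.items():
        end = AP.get(end, end)    # step 1: AP treatment
        end = TAP.get(end, end)   # step 2: TAP treatment
        if end == "p":            # step 3: ligase accepts only 5'-p donors
            tagged.append(name)
    return tagged


if __name__ == "__main__":
    sample = {
        "capped eukaryotic mRNA": "cap",
        "uncapped primary RNA": "ppp",
        "miRNA": "p",
        "RNase A fragment": "OH",
    }
    print(oligo_capping(sample))  # ['capped eukaryotic mRNA']
```

Under these assumptions only the capped mRNA is tagged, because the AP step has already converted the 5′-triphosphorylated and 5′-monophosphorylated RNAs to 5′-hydroxyl RNAs before TAP and RNA ligase are applied.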
However, one problem with the oligo capping and other methods presently in the art is that the AP step converts the 5′ ends of all RNA molecules that have a 5′ triphosphate or a 5′ monophosphate group to a 5′ hydroxyl group (e.g., see FIG. 2 of World Patent Application WO0104286). Thus, although the AP step is beneficial for some applications because it results in dephosphorylation of 5′-monophosphorylated RNA molecules (e.g., miRNA) so they cannot serve as donors for ligation to the acceptor oligonucleotide by RNA ligase, the AP step also results in dephosphorylation of uncapped mRNA molecules and uncapped non-coding primary RNA molecules (which may have functional significance) so they cannot serve as donors for ligation to the acceptor oligonucleotide. What is needed in the art are methods for selectively 5′ ligation tagging 5′-triphosphorylated uncapped RNA molecules, such as uncapped mRNA and non-coding primary RNA, in the sample, and for converting said 5′-ligation-tagged RNA molecules to cDNA, without also 5′ ligation tagging 5′-monophosphorylated RNA molecules in the sample.
In addition, what is needed in the art are methods for selectively dephosphorylating those RNA molecules in a sample that have a 5′ monophosphate group without also removing the 5′ triphosphate group from primary RNA transcripts. What is needed are methods, compositions, and kits that employ an enzyme composition that is capable of selectively digesting a 5′ monophosphate group of undesired RNA to a 5′ hydroxyl group so that the undesired RNA will not be 5′ ligation tagged by the acceptor oligonucleotide. Thus, what is needed are methods, compositions, and kits that employ an RNA 5′ monophosphatase enzyme composition.
Still further, although the methods known in the art can be used for selective 5′ ligation tagging of m7G-capped RNA molecules, there is currently no good method in the art for selective 5′ ligation tagging of only uncapped primary RNA molecules in a sample that also contains capped RNA molecules. This is regrettable because it would be desirable to specifically oligo cap (or “5′ ligation tag”) and study the uncapped eukaryotic primary RNAs that are believed to play a role in cellular biological activities, including regulation of gene expression. What is further needed in the art is a method for selective 5′ ligation tagging of uncapped eukaryotic primary RNA molecules in a sample that also contains capped eukaryotic RNA molecules.
It is further regrettable that there is currently no good method in the art for selective 5′ ligation tagging of only uncapped primary RNA molecules in samples that also contain capped RNA molecules because, in general, bacterial mRNA molecules are not capped. Thus, it is difficult to study the expression of genes of pathogenic (e.g., mycoplasma) or symbiotic (e.g., Rhizobium) prokaryotes that are associated with eukaryotic cells. What is needed in the art are methods for selective 5′ ligation tagging of 5′-polyphosphorylated RNA of prokaryotes, including uncapped primary RNA molecules of bacteria or mycoplasma that are present in or associated with eukaryotic cells, such as pathogenic or symbiotic prokaryotes in association with eukaryotic cells, without also 5′ ligation tagging capped eukaryotic mRNA molecules (e.g., to study prokaryotic gene expression during pathogenic or symbiotic processes).
What is further needed in the art are methods for selective 5′ ligation tagging of primary prokaryotic RNA molecules in samples from diverse environments (e.g., from soils, oceans, lakes, rivers, and other environments, including those with different or extreme conditions of temperature, pH, content of elements or chemicals, or other properties) in order to obtain, identify, characterize, clone, express, study, and exploit those RNA molecules for practical purposes (e.g., for identifying RNA transcripts to express enzymes or proteins with medical or industrial applications). By way of example, what is needed are 5′ ligation tagging methods that are easier, more efficient and that provide more and better data for metatranscriptomic surveys and research than methods known in the art (e.g., the methods described by J. Frias-Lopez et al., Proc. Natl. Acad. Sci. USA 105: 3805-3810, 2008).
Thus, what is needed in the art are methods for selective 5′ ligation tagging of desired RNA molecules without also 5′ ligation tagging undesired RNA molecules in the sample (e.g., for selective 5′ ligation tagging of uncapped primary RNA molecules but not capped RNA molecules in samples that contain both uncapped and capped RNA).
Prior to the present invention, no methods were known in the art for using an enzyme that would selectively digest the 5′ triphosphate of primary RNA, such as uncapped eukaryotic primary RNA or bacterial mRNA, to a 5′ monophosphate without also digesting capped eukaryotic mRNA. Thus, oligo capping methods known in the art could not be used for selectively synthesizing cDNA from uncapped eukaryotic primary RNA and/or full-length prokaryotic mRNA, for cloning cDNA prepared from uncapped full-length eukaryotic primary RNA and/or prokaryotic mRNA, for RNA amplification of uncapped full-length eukaryotic primary RNA and/or prokaryotic mRNA, or for capture and identification of the exact 5′ ends of uncapped full-length eukaryotic primary RNA and/or prokaryotic primary mRNA in samples that also contained capped RNA molecules. What is needed in the art are methods, compositions, and kits that employ an enzyme composition that is capable of digesting a 5′ triphosphate group of an uncapped primary RNA to a monophosphate under conditions wherein said enzyme composition does not digest the 5′ end of RNA that is capped. Thus, what is needed are methods, compositions, and kits that employ an RNA 5′ polyphosphatase enzyme composition.
What is needed in the art are methods, compositions, and kits that employ an RNA 5′ polyphosphatase enzyme composition and/or an RNA 5′ monophosphatase enzyme composition, including in combination with one or more other enzymes known in the art, for 5′ ligation tagging of any desired population of RNA molecules with an acceptor oligonucleotide using RNA ligase, for synthesizing cDNA from full-length desired RNA (e.g., full-length capped eukaryotic RNA, full-length uncapped eukaryotic primary RNA, and/or full-length prokaryotic primary mRNA) and for cloning said cDNA, for RNA amplification of said desired RNA, and for capture and identification of the exact 5′ ends of said desired RNA (e.g., by sequencing, or by using methods such as rapid amplification of cDNA ends (RACE), exon arrays, or other microarrays). What is needed are better and more efficient methods for making tagged DNA fragments from specific types of RNA molecules in samples for use in nucleic acid amplification, for making labeled target for expression analysis (e.g., using microarrays or qPCR) and for use as templates for next-generation sequencing.
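By way of illustration only, the selectivity that such enzyme compositions would need to provide can be sketched in Python. The sketch assumes the specificities stated above as needs (an RNA 5′ monophosphatase that converts only 5′ monophosphates to 5′ hydroxyls, and an RNA 5′ polyphosphatase that converts 5′ triphosphates to 5′ monophosphates without affecting capped RNA). The order of the enzyme steps, the names `RNA_5P_MONOPHOSPHATASE`, `RNA_5P_POLYPHOSPHATASE`, and `ligation_tagged`, and the example sample are hypothetical and are not taken from any described method.

```python
# 5' end-group labels used in this sketch: "cap", "ppp", "p", "OH".
# Assumed activities; end groups not listed are treated as unaffected.
RNA_5P_MONOPHOSPHATASE = {"p": "OH"}
RNA_5P_POLYPHOSPHATASE = {"ppp": "p"}


def ligation_tagged(sample, enzyme_steps):
    """Apply enzyme activities in order, then report which RNAs ligase can tag.

    `sample` maps an RNA name to its initial 5' end group; `enzyme_steps`
    is a list of activity tables applied one after another.
    """
    tagged = []
    for name, end in sample.items():
        for activity in enzyme_steps:
            end = activity.get(end, end)
        if end == "p":  # RNA ligase accepts only 5'-monophosphate donors
            tagged.append(name)
    return tagged


if __name__ == "__main__":
    sample = {
        "capped eukaryotic mRNA": "cap",
        "uncapped primary RNA": "ppp",
        "miRNA": "p",
    }
    # Hypothetical order: remove 5' monophosphates first, then convert ppp to p.
    steps = [RNA_5P_MONOPHOSPHATASE, RNA_5P_POLYPHOSPHATASE]
    print(ligation_tagged(sample, steps))  # ['uncapped primary RNA']
```

In this model, omitting the monophosphatase step leaves the miRNA with its 5′ monophosphate, so both the uncapped primary RNA and the miRNA would be tagged, whereas the capped mRNA is not tagged in either case.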