1. Field of the Invention
The present invention relates to the field of molecular biology. More particularly, the present invention relates to detection of microRNA (miRNA) molecules using homopolymeric tailing, reverse transcription and amplification.
2. Description of Related Art
MicroRNA (miRNA) are small non-protein coding RNA molecules that are endogenously expressed in eukaryotic organisms from fission yeasts to higher organisms. They regulate expression of up to 30% of all genes and play roles in cell differentiation, proliferation, apoptosis, anti-viral defense, and cancer. miRNA have tissue-specific and developmental-specific expression patterns. Thus, these small RNA molecules are of great interest in elucidation of biological processes, disease states, and development.
miRNA are transcribed by pol II as relatively long RNA molecules called pri-miRNA. These pri-miRNA have a 5′ cap and a poly-A tail, like other RNA transcripts. The pri-miRNA form hairpin-loop structures in the nucleus, and the hairpin structure is then cleaved at the base of the stem by the RNase III nuclease Drosha to form double-stranded molecules referred to as pre-miRNA. The pre-miRNA are exported to the cytoplasm by exportin 5, where they are cleaved by Dicer into short (17-25 nucleotide) double-stranded RNA molecules. The strand of the pre-miRNA with less 5′ stability can then become bound to the RNA-induced silencing complex (RISC) and effect mRNA regulation by binding at the 3′ untranslated region (3′ UTR) of mRNA having homology to the miRNA (target mRNA) or by directing transport of the mRNA into bodies. If there is 100% complementarity between the miRNA and the target mRNA, binding results in cleavage of the target mRNA (RNA interference). If there is less than 100% complementarity, binding results in down-regulation of expression without cleavage, either by blocking translation or by directing mRNA decay that is initiated by miRNA-guided rapid deadenylation. A useful resource for miRNA information is available from the Sanger Institute, which maintains a registry of miRNA at http://microrna.sanger.ac.uk/sequences/. The miRBase Sequence database includes the nucleotide sequences and annotations of published miRNA from a variety of sources. The miRBase Registry provides unique names for novel miRNA genes, prior to publication, that comply with the conventional nomenclature for new miRNA. The miRBase Targets database is a resource for predicted miRNA targets in animals. The databases are updated frequently and thus provide a comprehensive source of useful miRNA nucleotide sequences.
miRNA have been found in both coding and non-coding sequences within the genome. They have also been found oriented in either the sense or the anti-sense direction with regard to the particular gene in which they are located. Additionally, miRNA may be polycistronic, wherein more than one miRNA is present in a single transcript. Expression of miRNA in various cells has been estimated to range from less than 1,000 copies to more than 500,000 copies.
miRNA family gene expression is regulated spatially and temporally. To assist in the understanding of this regulation, many studies have examined rapidly-evolving Arabidopsis thaliana miRNA genes. These miRNA genes arose from a process of genome-wide duplication, tandem duplication, and segmental duplication followed by dispersal and diversification, in processes similar to those that drive the evolution of protein gene families. Multiple expression data sets were examined to study the transcription patterns of different members of the miRNA families. Changes in spatial and temporal expression patterns accompanied the sequence diversification of duplicated miRNA genes suggesting that duplicated copies acquire new functionality as they evolve (Maher, C., L. Stein, D. Ware. 2006. Evolution of Arabidopsis microRNA families through duplication events. Genome Res. 16:510-519).
Additional studies in Caenorhabditis elegans demonstrated that different miRNA family members may be involved in different physiological functions (Abbott, A. L., E. Alvarez-Saavedra, E. A. Miska, N. C. Lau, D. P. Bartel, H. R. Horvitz, V. Ambros. 2005. The let-7 MicroRNA family members mir-48, mir-84, and mir-241 function together to regulate developmental timing in Caenorhabditis elegans. Dev. Cell 9:403-414 and Leaman, D., P. Y. Chen, J. Fak, A. Yalcin, M. Pearce, U. Unnerstall, D. S. Marks, C. Sander, T. Tuschl, U. Gaul. 2005. Antisense mediated depletion reveals essential and specific functions of microRNAs in Drosophila development. Cell 121:1097-1108), further emphasizing the importance of discriminating between related miRNA family members.
A systematic evaluation determined the minimal requirements for functional miRNA-target duplexes in vivo (Brennecke, J., A. Stark, R. B. Russell, S. M. Cohen. 2005. Principles of microRNA-target recognition. PLoS Biol. 3(3):e85). In this study, target sites were grouped into two broad categories. In the first category, the mRNA binding site has sufficient complementarity to the miRNA 5′ end to function with little or no support from pairing to the miRNA 3′ end. In the second category, 5′ pairing is inadequate and strong 3′ pairing is required for function. Both types of site are present in biologically relevant genes. Additionally, evidence is presented that an average miRNA has approximately 100 target sites, indicating that miRNAs regulate a large fraction of protein-coding genes, and that miRNA 3′ ends are key determinants of target specificity within miRNA families. Thus, the differences in nucleotide sequence between the different miRNA family members may be critical to their biological function, and the ability to distinguish between highly homologous plant, mammalian, and worm miRNA may be crucial to an accurate understanding of the biological processes regulated by these miRNA.
Studies have shown that differential miRNA expression occurs in cancerous and non-cancerous tissues. miRNA represent 1% of the mammalian genome but more than 50% of miRNA genes are located within regions associated with amplification, deletion and translocation in cancer. It is likely that miRNA present in regions where genomic DNA has been deleted or amplified will not be expressed or be expressed at higher than normal levels, respectively. The expression of miRNA present in regions that are translocated will be governed by the nucleotide sequences in the area that the sequences were translocated to rather than where they were derived.
Differential expression of miRNA in cancerous and normal cells in various solid tumors has been well established (Lu J., G. Getz, E. A. Miska, E. Alvarez-Saavedra, J. Lamb, D. Peck, A. Sweet-Cordero, B. L. Ebert, R. H. Mak, A. A. Ferrando, J. R. Downing, T. Jacks, H. R. Horvitz, T. R. Golub. 2005. MicroRNA expression profiles classify human cancers. Nature 435:834-838 and Volinia, S., G. A. Calin, C-G. Liu, S. Ambs, A. Cimmino, F. Petrocca, R. Visone, M. Iorio, C. Roldo, M. Ferracin, R. L. Prueitt, N. Yanaihara, G. Lanza, A. Scarpa, A. Vecchione, M. Negrini, C. C. Harris, C. M. Croce. 2006. A microRNA expression signature of human solid tumors defines cancer gene targets. Proc. Natl. Acad. Sci. USA 103:2257-2261). Thus, detection of miRNA expression might be useful in diagnostics, including diagnosis of cancerous conditions. Additionally, miRNA expression might be useful in cancer prognosis and in determining the metastatic potential of tumors, and might thereby assist in identifying suitable adjuvant treatment.
As an example of highly homologous RNA, many miRNA are grouped into families based upon high sequence homology. In particular, nucleotide positions 2 through 7 from the 5′ end are generally 100% homologous (Lewis, B. P., I-H. Shih, M. W. Jones-Rhoades, D. P. Bartel, C. B. Burge. 2003. Prediction of Mammalian MicroRNA Targets. Cell. 115:787-798). The remaining nucleotides may differ by as few as a single nucleotide. This nucleotide difference may occur at any other position in the miRNA.
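The family relationship described above can be illustrated with a minimal sketch that extracts positions 2 through 7 and lists the positions at which two family members differ. The function names are hypothetical, and the two sequences are given only for illustration as let-7a and let-7c; they should be checked against the current miRBase entries before being relied upon.

```python
def seed(mirna: str) -> str:
    """Return nucleotides 2 through 7, counted 1-based from the 5' end."""
    return mirna[1:7]


def mismatch_positions(a: str, b: str) -> list:
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Illustrative sequences (assumed; verify against miRBase before use).
LET_7A = "UGAGGUAGUAGGUUGUAUAGUU"
LET_7C = "UGAGGUAGUAGGUUGUAUGGUU"

same_seed = seed(LET_7A) == seed(LET_7C)
differences = mismatch_positions(LET_7A, LET_7C)
```

With these two sequences the positions 2 through 7 are identical and the only difference lies near the 3′ end, which is the situation an assay must resolve in order to distinguish family members.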
The let-7 miRNA family has nine different members with highly homologous nucleotide sequences that may or may not have the same biological function. If they always have the same biological function, it may not be important to have an assay that is able to distinguish between them. While the study of the biological function of miRNA is just beginning, studies suggest that there are likely to be different biological functions associated with highly homologous miRNA.
It is widely known that cells constitutively express housekeeping or infrastructural RNAs. In addition, a wide variety of RNA participating in mechanisms involved in regulation of gene expression at all levels of transmission of genetic information from DNA to proteins are also expressed. The functional roles of noncoding RNA include chromatin structure remodeling, transcriptional and translational regulation of gene expression, regulation of protein function and subcellular distribution of RNA and proteins.
Noncoding transcripts have been identified in organisms belonging to all domains of life. These transcripts include microRNA, snoRNA, housekeeping (infrastructural) RNA (e.g. rRNA, tRNA, snRNA, SRP RNA), and tmRNA. Noncoding RNA also include imprinted transcripts (e.g. H19 and Air), dosage compensation transcripts (e.g. Xist, the mammalian X-inactive specific transcript), stress response transcripts, pol III transcripts, and disease-associated transcripts. Many of the disease-associated transcripts are overexpressed in cancers. A database of information on noncoding transcripts can be found at http:(doubleslash)ncrna(dot)rna(dot)net(dot)pl/Browser(dot)html.
RNA interference (RNAi) is an evolutionarily conserved process that functions to inhibit gene expression by means of 21-25 nucleotides of double-stranded RNA (dsRNA) known as small interfering RNA (siRNA) (Fire, A., S. Xu, M. K. Montgomery, S. A. Kostas, S. Driver, C. C. Mello. 1998. Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature. 391:806-811). siRNA molecules have potential as therapeutic agents by specific inhibition of expression of preselected proteins and as targets for drugs that affect the activity of siRNA molecules that regulate proteins involved in a disease state. Thus, the ability to accurately quantitate siRNA molecules with high specificity is desirable in monitoring their presence in different cell types in an organism, in particular in response to a defined stimulus or disease state. siRNA are similar to miRNA in their length and composition.
While the methods herein describe the quantitation of miRNA, also contemplated is the application of these methods to the detection of any RNA that lacks a polyA tail or whose detection benefits from the primer design described herein (e.g., siRNA).
A number of techniques have been developed over the last 30 years to detect nucleic acids of interest. Such techniques include everything from basic hybridization of a labeled probe to a target sequence (e.g., Southern blotting) to quantitative polymerase chain reaction (QPCR) to detect two or more target sequences with detection probes or multiple amplification primers, respectively. The polymerase chain reaction (PCR) or more specifically QPCR, is now commonly used in techniques designed to identify small quantities of a target nucleic acid in a sample.
Various techniques have been developed to discover new miRNA and to attempt to quantitate known miRNA in samples or tissues. Many of the studies performed to date have focused on determining the relative levels of miRNA expression. In a common technique, inserts from miRNA are ligated into a vector or to adapter sequences and then the nucleotide sequence is determined. In other techniques, Northern blotting is used to identify expression of miRNA. In general, Northern blotting techniques for studies of miRNA include lysing a cell sample, enriching for low molecular weight RNA, generating a typical Northern blot, hybridizing to a labeled probe which is complementary to a miRNA of interest, and determining the relative molecular weights of detected species to gain a general understanding of the relative amounts of pri-miRNA, pre-miRNA, and miRNA in the original sample.
Studies using Northern blotting typically focus on detection and confirmation of expression of predicted miRNA, and often attempt to quantitate miRNA expression in samples, particularly to determine tissue and time point specific miRNA expression. Studies using Northern blotting have also been performed in attempts to determine ratios of pri-miRNA, pre-miRNA, and miRNA in samples.
In silico predictions are widely used to discover novel miRNA that may be expressed. Computer algorithms have been developed and implemented to identify new miRNA. These in silico methods generally include scanning an organism's genome for sequences that have the potential to form hairpins. Sequences that are identified are then scanned for complementarity to 3′ UTR and compared to known homologs. Potential targets are then confirmed by bench experiments, such as through Northern blot experiments.
Microarrays have also been used to detect and measure the relative expression levels of miRNA. In general, microarray methods include spotting oligonucleotides that are complementary to known miRNA sequences on an array, generating fluorescence-labeled miRNA, and exposing the labeled miRNA to the array to determine if any miRNA of interest are present. Microarrays have been used to validate predicted miRNA, to discover homologs of known miRNA, to identify and monitor expression of a given miRNA in a tissue and/or over a time course, and to study miRNA processing.
The detection of miRNA presents unique challenges because of the short template length (19-24 nucleotides), varying G:C content between different miRNA and within the same miRNA nucleotide sequence, and high sequence homology between closely-related family members.
An ideal method of RNA quantitation has the following characteristics: high specificity to discriminate between RNA of high nucleotide sequence homology; high sensitivity to detect RNA of varying abundance; linear detection over a broad range of RNA copy numbers; compatibility with RNA from a variety of sources (cell lysates, total RNA, samples enriched for small RNA, RNA isolated from FFPE tissue samples); use of the same reaction conditions to detect all RNA, to allow for high throughput and ease of use; and detection of various classes of RNA in the same sample.
To overcome the short length of miRNA, several methods exist to add additional sequence to a miRNA to facilitate priming and detection. Many of these add a common sequence to every miRNA to allow for use of a single universal extension primer. In particular, QPCR-based miRNA detection methods add additional sequence to the miRNA to increase its length prior to or during reverse transcription. These methods include the use of a linker primer (Chen, C., D. Ridzon, Z. Zhou, K. Q. Lao, and N. A. Strauss. Methods, Compositions, and Kits Comprising Linker Probes for Quantifying Polynucleotides. U.S. patent application publication number 2006/0078906), use of a RT adapter (Raymond, C. K., B. S. Roberts, P. Garrett-Engele, L. P. Lim, J. M. Johnson. 2005. Simple-quantitative primer-extension PCR assay for direct monitoring of microRNAs and short-interfering RNAs. RNA. 11:1737-1744 and Raymond, C. K. Methods for Quantitating Small RNA Molecules. WIPO patent application PCT/US2006/002591), and direct ligation of an RNA molecule to the 3′ end of miRNA (Jacobsen, N., L. Kongsbak, S. Kauppinen, S. M. Echwald, Mouritzen, P. S. Nielsen, M. Norholm. U.S. patent application publication number 2005/0272075). In Sorge, J. A. and R. L. Mullinax (U.S. patent application publication number 2006/0211000), Yeakley, J. M. (Methods and Compositions for Detection of Small Interfering RNA and Micro-RNA), and U.S. patent application publication number 2006/0019258, additional sequence is added by annealing two DNA molecules that exceed the length of the miRNA to a preselected miRNA and ligating them to form a long DNA template. In addition, Yeakley generates a labeled miRNA by the use of a phosphate reactive reagent having a label moiety.
In similar methods, additional sequence is added by polyadenylation of the miRNA using Escherichia coli polyA polymerase (Shanfa, L., Y.-H. Sun, R. Shi, C. Clark, L. Li, V. L. Chiang. 2005. Novel and Mechanical Stress-Responsive MicroRNAs in Populus trichocarpa that Are Absent from Arabidopsis. Plant Cell. 17(8):2186-2203; and Lu, R., V. L. Chiang. 2005. Facile means for quantifying microRNA expression by real-time PCR. BioTechniques. 39(4):519-25). An anchored oligo dT primer is annealed to the polyA tail and extended in a reverse transcription reaction.
Thus, a short RNA template can be increased in length by the addition of sequence in reverse transcription, ligation, or polyadenylation reactions. While these methods provide a means of increasing the length of the template for QPCR and of providing a universal priming site for the reverse primer in QPCR, the added sequence does not improve the efficiency or specificity of priming in the region of the miRNA.
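The polyadenylation approach described above can be illustrated, in simplified form, by a sketch that computes the first-strand cDNA expected when an adapter-bearing oligo dT primer is extended across a tailed miRNA. The function names, the adapter sequence, and the oligo dT length are hypothetical and chosen only for illustration; the sketch also assumes that the primer anneals exactly at the junction between the miRNA and its polyA tail, which is the purpose of the anchor base.

```python
# RNA base -> complementary DNA base
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_complement_dna(rna: str) -> str:
    """Return the DNA reverse complement (5'->3') of an RNA sequence."""
    return "".join(RNA_TO_CDNA[base] for base in reversed(rna.upper()))


def first_strand_cdna(mirna: str, adapter: str, dt_length: int) -> str:
    """Model the cDNA (5'->3') made from a polyadenylated miRNA.

    The primer contributes the adapter and the oligo dT stretch; extension
    by reverse transcriptase contributes the complement of the miRNA.
    """
    return adapter + "T" * dt_length + reverse_complement_dna(mirna)


# Hypothetical adapter and a toy 4-nt "miRNA", for illustration only.
cdna = first_strand_cdna("UGAC", adapter="GCG", dt_length=3)
```

In this model the adapter at the 5′ end of the cDNA supplies the universal priming site for the reverse primer, while the only miRNA-specific region remains as short as the miRNA itself.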
PCR primer design is critical for the sensitive and specific detection of a target molecule. To this end, many rules have been established and many computer programs which employ these rules are available. These basic rules include ensuring that the melting temperature (Tm) of the PCR primers is above that of the temperature employed in the annealing reaction of the PCR and below that of the temperature employed in the extension reaction of the PCR. While Tm calculations which employ the nearest neighbor rules (Zuker, M. 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31(13): 3406-3415 and Mathews, D. H., J. Sabina, M. Zuker and D. H. Turner. 1999. Expanded Sequence Dependence of Thermodynamic Parameters Improves Prediction of RNA Secondary Structure. J. Mol. Biol. 288, 911-940) are generally more accurate in predicting the actual Tm, empirical testing is still recommended in initial testing to identify optimal reaction conditions.
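The basic rule stated above, that the primer Tm should lie above the annealing temperature and below the extension temperature, can be sketched as follows. For brevity the sketch estimates Tm with the simple Wallace rule (2° C. per A or T, 4° C. per G or C), which is a rough approximation for short oligonucleotides and is not the nearest neighbor calculation cited above; the function names and the example primer are hypothetical.

```python
def wallace_tm(primer: str) -> float:
    """Rough Tm estimate in degrees C: 2 per A/T plus 4 per G/C (Wallace rule)."""
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def tm_in_window(primer: str, anneal_c: float, extend_c: float) -> bool:
    """True if the estimated Tm is above the annealing temperature and
    below the extension temperature."""
    return anneal_c < wallace_tm(primer) < extend_c


# Hypothetical 18-nt primer with 9 A/T and 9 G/C bases.
example_primer = "ACGTACGTACGTACGTAC"
estimated_tm = wallace_tm(example_primer)
ok = tm_in_window(example_primer, anneal_c=50.0, extend_c=72.0)
```

A check of this kind only screens candidate primers; as noted above, empirical testing is still recommended to identify optimal reaction conditions.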
Specifically, Shanfa, L. and Lu, R. (above) teach designing a forward primer for the detection of miRNA based on the entire sequence of the miRNA being detected. The guideline given is that if a forward primer contains more than three G/C within the five 3′-end nucleotides, one or two adenines are added to the 3′ end of the primer to ensure its binding to the target site encompassing the miRNA sequence and the thymines in the poly(T) adapter (Lu, R., above). According to Lu, the true melting temperature (Tm) of potential primers was then determined experimentally by annealing the primer to its complement and generating a thermal dissociation curve between 45° C. and 95° C. Lu, R. (above) used this primer design method to distinguish between miRNAs differing by two or more nucleotides using a standard protocol and by one or more nucleotides using a high-stringency amplification. This amplification protocol was developed by annealing templates with no mismatch or with one or more mismatches to the PCR primer and identifying a temperature at which all primers with mismatches have a Tm that is lower and the perfectly matched primer has a Tm that is 5° C. higher. While this method may have high specificity, it requires excessive experimentation to evaluate every primer to determine its optimal Tm, and it results in protocols that require separate amplifications to detect miRNA of different Tms. In addition, the mismatches that were tested did not include G:T or T:T mismatches, which are known to be more difficult to discriminate. Thus, varying annealing and/or extension temperatures alone may not result in single nucleotide discrimination in all cases.
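The 3′-end guideline described above can be sketched as a small check on a candidate forward primer. The function names and example primers are hypothetical, and because the guideline permits either one or two adenines, the number to add is left here as a parameter rather than decided by the code.

```python
def gc_in_3prime_end(primer: str, window: int = 5) -> int:
    """Count G and C bases within the last `window` nucleotides of the primer."""
    return sum(base in "GC" for base in primer.upper()[-window:])


def adjust_forward_primer(primer: str, n_adenines: int = 1) -> str:
    """Append one or two adenines when more than three of the five
    3'-end nucleotides are G or C; otherwise return the primer unchanged."""
    if n_adenines not in (1, 2):
        raise ValueError("the guideline allows only one or two adenines")
    if gc_in_3prime_end(primer) > 3:
        return primer + "A" * n_adenines
    return primer


# Hypothetical primers: the first has a G/C-rich 3' end, the second does not.
adjusted = adjust_forward_primer("TTAGTGGCGC")
unchanged = adjust_forward_primer("TGAGGTAGTAGGTTGTATAGTT")
```

The check addresses only the sequence rule; under the approach described above, the Tm of each resulting primer would still be determined experimentally.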
Alternatively, modified nucleotides have been used to develop reagents to detect miRNA in microarray-based (Castoldi, M., S. Schmidt, V. Benes, M. Noerholm, A. E. Kulozik, M. W. Hentze, M. U. Muckenthaler. 2006. A sensitive array for microRNA expression profiling (miChip) based on locked nucleic acids (LNA). RNA. 12(5):913-920) and QPCR-based (Raymond, C. K., B. S. Roberts, P. Garrett-Engele, L. P. Lim, J. M. Johnson. 2005. Simple-quantitative primer-extension PCR assay for direct monitoring of microRNAs and short-interfering RNAs. RNA. 11:1737-1744 and Raymond, C. K. Methods for Quantitating Small RNA Molecules. WIPO patent application PCT/US2006/002591) assays. In these methods, a modified nucleotide, such as a locked nucleic acid (LNA) that possesses a 2′-O,4′-C methylene bridge in the ribose moiety of the nucleotide (Petersen, M., J. Wengel. 2003. LNA: A versatile tool for therapeutics and genomics. Trends Biotechnol. 21:74-81), is incorporated into a detection molecule. LNA bases increase the hybridization affinity of the oligonucleotides that contain them and are therefore included in the miRNA-specific detection reagents. In Castoldi, et al., the capture molecules on the microarray include one or more LNA bases, and in Raymond, et al., the miRNA-specific QPCR primers include one or more LNA bases. While the microarray-based method was sensitive, it was unable to distinguish between several of the closely-related let-7 family members, with as high as 20% and 30% relative signal between labeled let-7e and let-7b miRNA, respectively, and a let-7a capture probe (http://www.exiqon.com/SEEEMS/26.asp). In addition, while the QPCR-based method had high sensitivity and specificity, high background signal in the absence of template was also observed. This background signal was attributed to primer-dimer formation between the gene specific primer used in reverse transcription and the miRNA-specific primer which included the LNA.
Thus, a method with higher specificity and lower background that more accurately quantitates the amount of miRNA in a test sample is desirable.
Non-templated nucleotide sequences are commonly added to a template during PCR in order to add nucleotide sequences for cloning (e.g., restriction endonuclease recognition sites), add nucleotides encoding a protein used in purification of the protein (e.g., a HIS tag), increase the Tm of the primer to improve priming efficiency in subsequent extension reactions, and to serve as a priming site in subsequent extension reactions.
An example of non-templated addition of nucleotide sequences includes those that occur during natural processes. In particular, retroviruses, for example Moloney Murine Leukemia Virus (MMLV), perform template switches during the synthesis of retroviral DNA. In this process, the MMLV RT begins DNA synthesis on one viral RNA and then switches to a different viral RNA template in a process called intermolecular template switching. Template switching occurs far more frequently between regions of high sequence homology than between non-homologous sequences (Hu W-S, H. M. Temin. 1990. Retroviral recombination and reverse transcription. Science. 250:1227-1233 and Zhang J., H. M. Temin. 1993. Rate and mechanism of non-homologous recombination during a single cycle of retroviral replication. Science. 259:234-238).
Similarly, a template switching mechanism utilizing a 7-methylguanosine CAP structure present on the 5′ ends of all eukaryotic mRNAs (U.S. Pat. Nos. 5,962,271 and 5,962,272) forms the basis of a method designed to clone full-length cDNA. In this method, a template switching oligonucleotide having one or more ribonucleotides at the 3′ end, at least one of which is GMP, is used to generate full-length cDNA. An RNA sample is combined with a cDNA synthesis primer to allow annealing of the cDNA synthesis primer to mRNA to produce a primer-mRNA complex; the primer-mRNA complex is incubated under conditions that permit template-dependent extension of the primer to generate an mRNA-cDNA hybrid; and the mRNA-cDNA hybrid and template switching oligonucleotide are contacted under conditions that permit template-dependent extension of the cDNA of the hybrid, such that a 3′ end of the cDNA sequence comprises a sequence that is complementary to the template switching oligonucleotide. The added sequence most commonly serves as a primer binding site in subsequent extension reactions. Thus, the non-templated addition of Cs is used in the addition of the nucleotide sequence of the template switching oligonucleotide to the 3′ end of a newly synthesized cDNA during the reverse transcription reaction.
It was later found that MMLV RT adds a few non-templated nucleotides (primarily C) to the 3′ end of a newly synthesized cDNA strand upon reaching the 5′ end of the RNA template, and that this addition was therefore not dependent upon the presence of a 7-methylguanosine CAP structure (Matz, M., D. Shagin, E. Bogdanova, O. Britanova, S. Lukyanov, L. Diatchenko, A. Chenchik. 1999. Amplification of cDNA ends based on template-switching effect and step-out PCR. Nucleic Acids Res. 27(6):1558-1560).
U.S. patent application publication number 2006/0051771 discloses a method for tailing and amplifying RNA. More specifically, it discloses a method for increasing the efficiency of tailing a targeted RNA in a sample, where the method comprises altering the secondary structure of the targeted RNA and incubating the targeted RNA in the presence of a tailing enzyme and a nucleotide under conditions that allow tailing of the targeted RNA. Exemplary methods for altering the secondary structure of the targeted RNA include denaturing the targeted RNA, such as by heating or adding a single strand binding protein.
While numerous techniques and reagents are available for detection and analysis of miRNAs, there still exists a need in the art for methods of miRNA detection that have high specificity and sensitivity, have low or no background signal in the absence of template, allow for the detection of hundreds of different miRNA from a single reverse transcription reaction, use the same reaction conditions to detect all RNA to allow for high-throughput and ease of use, and allow for the detection of various classes of RNA in the same sample.