The discovery of microRNAs (miRNAs) and other short RNAs, such as small interfering RNAs (siRNAs) and short non-coding RNAs (snRNAs), has led to a rapid expansion of research elucidating their expression and diverse biological functions.
Recent studies have shown that distinct expression patterns of miRNAs are associated with specific types of cancer and certain other diseases, suggesting that miRNAs could represent a new class of biomarkers and prognostic indicators (Zhang and Farwell 2008). Good biomarkers can facilitate earlier diagnosis, which typically leads to better treatment outcomes.
The ability to distinguish members of small RNA families, such as miRNA isoforms, which differ by single nucleotide polymorphisms, or miRNA isomirs, which differ by nucleotide additions or deletions at the ends, is an important requirement for a successful platform for miRNA-based diagnostics or for monitoring disease progression or response to therapy (Lee et al. 2010).
The majority of current methods for expression profiling (EP) of miRNAs have been adapted from previously established assays for messenger RNAs (mRNAs) with modifications that accommodate the differences between mRNA and miRNA. MiRNAs are much smaller than mRNAs, have 5′-phosphate (5′-p) and 3′-hydroxyl (3′-OH) ends, and are not polyadenylated. Small RNAs that have different ends from miRNAs can be enzymatically converted to 5′-p and 3′-OH ends in order to apply the same methods of analysis as for miRNAs (Lamm et al. 2011; McCormick et al. 2011). Moreover, long coding and non-coding RNAs may be cleaved into smaller fragments and analyzed similarly to miRNAs, including by RNA sequencing (RNA-seq) methods (Lamm et al. 2011; McCormick et al. 2011). Therefore, the methods described herein for miRNAs are also applicable to other small RNAs and to fragments of large RNAs.
Sequencing, which does not require prior knowledge of the RNA sequence, is the only method of RNA analysis that allows discovery of new miRNAs (as well as other naturally occurring RNAs). Sequencing methods can also reveal expression profiles for miRNAs through the frequencies with which individual sequences appear (digital gene expression, DGE) (Linsen et al. 2009). For already known miRNA sequences, expression profiling can also be accomplished by other methods, such as microarrays and RT-PCR, which currently are the standard methods for EP and molecular diagnostics (Blow 2009; Willenbrock et al. 2009; Benes and Castoldi 2010).
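The idea of digital gene expression described above, in which an expression profile is read off from the frequencies of individual sequences, can be illustrated with a minimal Python sketch. It assumes exact-match counting of reads against a table of known sequences and scaling to reads per million; the function name, the labels miR-A and miR-B, and the toy sequences are hypothetical placeholders and are not taken from the cited work.

```python
from collections import Counter


def digital_expression_profile(reads, known_mirnas):
    """Count exact-match reads per known miRNA and scale to reads per million.

    reads        -- list of read sequences (strings)
    known_mirnas -- dict mapping a miRNA label to its reference sequence
    """
    counts = Counter(reads)          # frequency of every distinct read sequence
    total = sum(counts.values())     # total number of reads in the library
    profile = {}
    for name, seq in known_mirnas.items():
        n = counts.get(seq, 0)
        rpm = n * 1_000_000 / total if total else 0.0
        profile[name] = {"count": n, "rpm": rpm}
    return profile


# Hypothetical toy data: labels and sequences are illustrative placeholders only.
known = {
    "miR-A": "UAGCAGCACGUAAAUAUUGGCG",
    "miR-B": "UGAGGUAGUAGGUUGUAUAGUU",
}
reads = (
    ["UAGCAGCACGUAAAUAUUGGCG"] * 3
    + ["UGAGGUAGUAGGUUGUAUAGUU"] * 1
    + ["ACGUACGUACGUACGUACGU"] * 6   # reads unrelated to the miRNAs of interest
)
profile = digital_expression_profile(reads, known)
```

In this toy library, six of the ten reads match neither sequence of interest, so they add to the sequencing cost without contributing to the profile of the two listed miRNAs.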
Nevertheless, next generation sequencing (NGS) is increasingly viewed as the future of expression profiling and molecular diagnostics (Su et al. 2011). NGS methods are well suited to these applications because they combine unlimited multiplexing capability, single-molecule sensitivity, essentially unlimited dynamic range, and unparalleled sequence specificity. NGS provides expression profiles for all miRNAs through the relative frequencies with which individual sequences appear and permits global mean normalization, which is more accurate than normalization methods that rely on a limited number of stably expressed small RNAs (Mestdagh et al. 2009). Specialized NGS methods have the potential to replace both arrays and RT-qPCR. However, current NGS methods are not suitable for routine miRNA expression profiling and diagnostic assays, primarily because of their high cost and the need for laborious, time-consuming procedures for preparing sequencing libraries. These procedures also include mandatory gel-purification, extraction and ethanol precipitation steps that may significantly affect the accuracy of miRNA quantification (McDonald et al. 2011). Moreover, current NGS methods are not selective for specific miRNA sequences of interest. Therefore, the number of sequencing reads for specific miRNA biomarkers can be insignificant because of the overwhelming number of unrelated sequencing reads.
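As a rough illustration of global mean normalization applied to read counts, the following Python sketch centers each detected miRNA's log2 count on the mean log2 count of all detected miRNAs in the same sample. This is only one plausible formulation, assumed here for illustration; the text above does not specify a formula, and the function name and toy counts are hypothetical.

```python
import math


def global_mean_normalize(counts):
    """Center log2 read counts on the mean log2 count of all detected miRNAs.

    counts -- dict mapping a miRNA label to its raw read count in one sample.
    miRNAs with zero reads are treated as undetected and left out (an assumption).
    """
    detected = {name: n for name, n in counts.items() if n > 0}
    if not detected:
        return {}
    logs = {name: math.log2(n) for name, n in detected.items()}
    mean_log = sum(logs.values()) / len(logs)   # the "global mean" of the sample
    return {name: value - mean_log for name, value in logs.items()}


# Hypothetical toy sample; the labels and counts are illustrative only.
sample = {"miR-A": 2, "miR-B": 4, "miR-C": 8, "miR-D": 0}
normalized = global_mean_normalize(sample)

# The same sample sequenced ten times deeper gives the same normalized values,
# because a uniform scale factor cancels when the global mean is subtracted.
deeper = global_mean_normalize({name: n * 10 for name, n in sample.items()})
```

Because the reference is the mean over every detected miRNA and not a handful of presumed stable small RNAs, an unexpected change in any single reference species has less influence on the result.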
Knowledge of the absolute and relative expression of miRNAs is important for understanding the biogenesis of miRNAs, regulation of biochemical pathways by miRNAs, and identification of miRNA biomarkers. For a given set of miRNAs, differences in abundance between samples (e.g., differences in miR-16 levels between healthy and diseased tissue) determined by NGS, arrays and RT-qPCR methods correlate well. However, the absolute copy numbers of individual miRNAs, as well as the relative copy numbers of various miRNAs detected within the same samples, do not correlate well when determined by the various methods, because each method has its own sequence-associated biases (Nelson et al. 2008; Bissels et al. 2009; Linsen et al. 2009; Git et al. 2010; Lee et al. 2010; Tian et al. 2010). All these methods would significantly benefit from improvements that reduce cost and increase the accuracy of expression profiling of miRNAs of interest.
The present invention addresses these issues.