The present invention relates generally to methods for generating stranded cDNA libraries.
As the complexities of gene regulation become better understood, a need for capturing additional data has emerged. Stranded information identifies from which of the two DNA strands a given RNA transcript was derived. This information can provide, for example, increased confidence in transcript annotation, transcript discovery and expression profiling. Additionally, identifying strand origin can increase the percentage of alignable reads, thereby reducing sequencing cost per sample. Maintaining strand orientation also allows identification of antisense expression, which is an important mediator of gene regulation. The ability to capture the relative abundance of sense and antisense expression provides visibility into regulatory interactions that might otherwise be missed.
Methods for determining mRNA sequences can involve analyzing the DNA sequence of single clones of a cDNA library, which can be derived by enzymatic production of double-stranded cDNA from the mRNA isolated from a target cell or population of cells. Methods for determining the relative abundance of mRNA species typically involve quantifying the hybridization of a defined nucleic acid sequence to a complementary sequence in the mRNA population. Analysis of samples containing a relatively low quantity of mRNA can involve amplification prior to the application of methods for determining the sequence or relative abundance of particular mRNA species. One of ordinary skill also recognizes that amplification methods that proceed exponentially are more likely to introduce bias in the relative levels of different mRNAs.
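By way of illustration only, the compounding effect of exponential amplification on relative mRNA levels can be sketched numerically. The following is a simplified model and not part of any described method: the function names, the per-cycle efficiencies (0.95 and 0.90) and the cycle count (20) are hypothetical values chosen for the example. The exponential model assumes every product serves as a template in the next cycle, whereas the linear model assumes only the original templates are copied in each round.

```python
def exponential_yield(start_copies, efficiency, cycles):
    """Copies after exponential amplification: products are re-used as templates."""
    return start_copies * (1.0 + efficiency) ** cycles


def linear_yield(start_copies, efficiency, rounds):
    """Copies after linear amplification: only original templates are copied."""
    return start_copies * (1.0 + efficiency * rounds)


def abundance_ratio(yield_fn, eff_a, eff_b, cycles, start_copies=1000):
    """Observed ratio of two transcripts that started at equal abundance."""
    return yield_fn(start_copies, eff_a, cycles) / yield_fn(start_copies, eff_b, cycles)


# Hypothetical inputs: two transcripts at equal starting abundance whose
# amplification efficiencies differ slightly (0.95 vs 0.90 per cycle).
exp_ratio = abundance_ratio(exponential_yield, 0.95, 0.90, cycles=20)
lin_ratio = abundance_ratio(linear_yield, 0.95, 0.90, cycles=20)

print(f"exponential: observed A/B ratio = {exp_ratio:.2f}")
print(f"linear:      observed A/B ratio = {lin_ratio:.2f}")
```

Under these assumed values, the true ratio of the two transcripts is 1, and the distortion is markedly larger in the exponential model than in the linear model, because a small per-cycle difference in efficiency is multiplied at every cycle rather than added.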
Existing methods developed for amplification of nucleic acid molecules have their shortcomings. Some methods suffer from, for example, sequence bias during exponential amplification, inefficiency of single-stranded ligation, narrow applicability to a few forms of RNA and DNA, and the requirement of a 5′-terminal cap. Accordingly, there exists a need for methods that are capable of unbiased selection of stranded RNA sequences from an RNA sample. The present invention satisfies this need and provides related advantages.