Current procedures for preparing samples of total cellular RNA for sequencing require converting the RNA into a library of DNA molecules suitable for high-throughput DNA sequencing. Purified RNA or RNA fragments are reverse transcribed into first strand cDNA using poly(dT) or random primers. This is followed by second strand cDNA synthesis in the presence of RNase H and DNA polymerase I. The template is typically amplified prior to sequencing, for example by PCR, to obtain sufficient material for the sequencing reaction. Amplification may be performed prior to attachment of the DNA to a solid surface, as in the Roche/454 system (Hoffmann-La Roche). In other systems, solid-phase amplification is used to produce randomly distributed, clonally amplified clusters of templates on a glass slide, with forward and reverse primers also covalently attached to the slide (Illumina/Solexa). Current sequencing methods are therefore capable of providing thousands of prepared template molecules immobilized on a solid surface or support, resulting in spatially separated template sites that allow thousands of sequencing reactions to be performed simultaneously.
Because the template must be amplified and attached to a solid surface, the isolated RNA must be modified during sample preparation by appending sequences for use in amplification (e.g., PCR primer sites) and attachment (e.g., adapters specific for the sequencing system or reaction being used). These modifications are typically cumbersome, multi-step processes that can take up to two days to complete. For example, the Illumina mRNA sample preparation procedure, which requires two days to complete, involves several steps to prepare the double-stranded cDNA for sequencing, all of which are performed prior to application to the solid support: end-repair, addition of an “A” base to the 3′ ends, ligation of adapters, purification of the ligation product, and PCR amplification.
It is desirable in genome and transcriptome sequencing to be able to identify which strand of the double-stranded cDNA is being sequenced. Although the available sequencing systems typically append a different adapter sequence to each end of the molecule during sample preparation, this does not produce a strand-specific or oriented library that would allow identification of the strand being sequenced.
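The loss of strand information can be illustrated with a minimal sketch. It rests on a simplifying assumption that is not stated in the text above: the two adapters may end up on the double-stranded cDNA in either orientation relative to the original RNA, so a read may report either strand of the duplex. The function names and the example sequence are hypothetical and serve only to illustrate the point. Under this assumption, a transcript and its antisense counterpart yield the same set of possible reads.

```python
# Simplified model of a non-oriented library (illustrative assumption only).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def unstranded_reads(transcript_cdna: str) -> set[str]:
    """Reads obtainable when either strand of the duplex may be sequenced."""
    return {transcript_cdna, reverse_complement(transcript_cdna)}


# Hypothetical sense transcript and the antisense transcript of the same locus.
sense = "ATGGCCTTAGC"
antisense = reverse_complement(sense)

# Both transcripts give the same possible reads, so a read alone
# cannot reveal which strand was originally transcribed.
same_reads = unstranded_reads(sense) == unstranded_reads(antisense)
print(same_reads)
```

In an oriented library, by contrast, each read would correspond to a known strand of the original RNA, so the two transcripts would remain distinguishable.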
A need therefore exists for methods of preparing whole mRNA for sequencing that produce strand-specific libraries adaptable to a variety of sequencing instruments and systems and that reduce the time required for sample preparation. The present invention meets these needs.