Selective amplification of cDNAs represents a major research goal of molecular biologists, with particular importance in diagnostic and forensic applications, as well as for general manipulations of genetic materials.
In many important areas of research, such as in studying gene regulation in complex biological systems (e.g., the brain) having multiple phenotypes, obtaining sufficient mRNA for the isolation, cloning, and characterization of specific regulated transcripts is problematic. Research has been hindered by, e.g., the high complexity of the mRNA, the relatively low abundance of many important expressed messages, and the spatially limited expression of these messages. In particular, the identification and cloning of novel regulated messages from discrete cell populations has proven to be a formidable task.
The polymerase chain reaction (PCR) is an extremely powerful technique for amplifying specific nucleic acid sequences, including mRNA in certain circumstances. As described in U.S. Pat. No. 4,683,202 and U.S. Pat. No. 4,683,195 (both of which are incorporated herein by reference), PCR typically comprises treating separate complementary strands of a target nucleic acid with two oligonucleotide primers to form complementary primer extension products on both strands that act as templates for synthesizing copies of the desired nucleic acid sequences. By repeating the separation and synthesis steps in an automated system, essentially exponential duplication of the target sequences can be achieved.
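The "essentially exponential duplication" described above can be illustrated with a short calculation. The following is a minimal sketch, not part of the cited patents: the function name `pcr_copies` and the per-cycle `efficiency` parameter are assumptions introduced here, with an efficiency of 1.0 standing for ideal doubling in every cycle.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after a given number of PCR cycles.

    Assumes every cycle multiplies the target by (1 + efficiency),
    where efficiency = 1.0 is perfect doubling (an idealization).
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Compare ideal doubling with a hypothetical 80% per-cycle efficiency.
    for n in (10, 20, 30):
        ideal = pcr_copies(1, n)
        reduced = pcr_copies(1, n, efficiency=0.8)
        print(f"{n} cycles: ideal {ideal:.3g} copies, 80% efficiency {reduced:.3g} copies")
```

Under ideal doubling a single template yields 2^n copies after n cycles; any efficiency below 1.0 lowers the base of the exponent but leaves the growth exponential.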
PCR, however, has several well-known limitations. For example, PCR typically requires that 5′ terminus and 3′ terminus sequence information be known for the synthesis of the primers. Recently, homopolymeric tailing of the 3′ terminus (see Frohman et al., Proc. Natl. Acad. Sci. USA 85: 8998–9002 (1988) and Eberwine et al., Neuroscience Short Course I (Society for Neuroscience) 69–81 (1988)) and the synthesis of highly degenerate nucleotide primers (Gould et al., Proc. Natl. Acad. Sci. USA 86: 1934–1938 (1989)) have been implemented to improve the range of cDNAs that can be cloned with PCR. An additional problem is the low fidelity of the most widely used enzyme in PCR, Thermus aquaticus (Taq) polymerase. This characteristic of Taq results in misincorporations that are propagated through the subsequent cycles of PCR amplification, ultimately producing faulty cDNA libraries. Also, sequences longer than 3 kilobases create difficulties in Taq polymerization, which can skew cDNAs to smaller sizes during amplification. Of course, unless modified, PCR provides amplification by DNA replication and not by transcription.
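The way misincorporations accumulate over successive cycles can be sketched numerically. The model below is a deliberate simplification and an assumption of this illustration: it treats every base copy as an independent event with a fixed error probability and follows a single lineage that is recopied a given number of times. The function name and the error rate used in the example are hypothetical and are not measured values for Taq polymerase.

```python
def error_free_fraction(error_rate_per_base: float,
                        length_bases: int,
                        copy_events: int) -> float:
    """Probability that a lineage carries no misincorporation.

    Simplified model: each of the length_bases positions is copied
    copy_events times, and every copy fails independently with
    probability error_rate_per_base.
    """
    if not 0.0 <= error_rate_per_base <= 1.0:
        raise ValueError("error rate must lie between 0 and 1")
    if length_bases < 0 or copy_events < 0:
        raise ValueError("length and copy events must be non-negative")
    return (1.0 - error_rate_per_base) ** (length_bases * copy_events)


if __name__ == "__main__":
    # Hypothetical rate of 1 error per 10,000 bases copied, 1 kb target.
    for copies in (1, 10, 30):
        fraction = error_free_fraction(1e-4, 1000, copies)
        print(f"{copies} copy events: {fraction:.3f} of lineages error-free")
```

Because an error made in an early cycle is inherited by every descendant strand, the error-free fraction falls with both target length and cycle number, which is the propagation effect described above.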
In this regard, Sarkar et al., Science 244: 331–334 (1989), recently described a method, called RAWTS (RNA amplification with transcript sequencing), for detecting extremely low abundance messages in heterologous cell types. This method is a modification of GAWTS (genomic amplification with transcript sequencing; see Stoflet et al., Science 239: 491 (1988)), which incorporates a phage promoter into at least one of the two primers used in PCR. In RAWTS, mRNA is amplified by PCR. A phage promoter incorporated into the PCR oligonucleotide primer allows abundant transcription, from which RNA can be sequenced directly.
Four steps are used in RAWTS: (1) first strand cDNA synthesis from total RNA or mRNA using oligo(dT) or an mRNA-specific oligo primer, dNTPs, and reverse transcriptase; (2) PCR, wherein one or both primers contain a T7 phage promoter attached to a sequence complementary to the region to be amplified; (3) transcription of the cDNA strand with T7 RNA polymerase; and (4) reverse transcriptase-mediated dideoxy sequencing of the resultant mRNA transcript.
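Step (2) above relies on a primer that carries a T7 promoter ahead of a sequence complementary to the region being amplified. The sketch below shows one way such a primer could be assembled in software. It is an illustration only: the function names, the 20-base annealing length, and the choice to anneal at the 3′ end of the supplied sense strand are assumptions, and the promoter string is the commonly cited T7 promoter sequence rather than anything specified in the text.

```python
# Commonly cited T7 promoter sequence (assumption: not given in the source text).
T7_PROMOTER = "TAATACGACTCACTATAGGG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G, T")
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_tailed_primer(sense_region: str,
                           anneal_len: int = 20,
                           promoter: str = T7_PROMOTER) -> str:
    """Build a 5'->3' primer: promoter, then a stretch complementary
    to the last anneal_len bases of the sense-strand region."""
    if anneal_len <= 0 or len(sense_region) < anneal_len:
        raise ValueError("region is shorter than the annealing length")
    annealing = reverse_complement(sense_region[-anneal_len:])
    return promoter + annealing


if __name__ == "__main__":
    # Hypothetical 30-base sense-strand region, for illustration only.
    region = "ATGGCCAAGCTTGGATCCGAATTCCTGCAG"
    print(promoter_tailed_primer(region))
```

Only the annealing portion pairs with the template in the first cycle; the promoter becomes part of the double-stranded product in later cycles, where it can serve as the start site for T7 RNA polymerase in step (3).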
In spite of such recent advances, including PCR and its various modifications noted above, there exists a need for improved methods of identifying and cloning mRNAs and of accurate in vitro amplification of selected cDNAs. The methods should produce about 100-fold or more amplification of heterogeneous populations of RNA from limited quantities of cDNA. Preferably, the overall methodologies will be capable of replicating a broad range of messages without prior cloning into vectors and without knowledge of sequence in some instances. The present invention fulfills these and other needs.