In many important areas of research, particularly those involving complex biological systems, obtaining sufficient mRNA for the analysis of expression, alternative splicing, and SNP variation is problematic. Limiting factors include the high complexity of the mRNA, the relatively low abundance of many important expressed messages, and the spatially limited expression of these messages.
A tool showing considerable promise for genetic analysis is the nucleic acid array, reviewed by Ramsay (1998) Nat. Biotech. 16:40-44. These arrays contain dense collections of nucleic acids, either PCR products or oligonucleotides, usually of known sequence, that have been either synthesized or printed at fixed spatial locations on suitable substrates, such as nylon filters or glass slides. When labeled DNA or RNA samples are hybridized to the arrays, the abundance of specific sequences in solution can be quantitated based on the fluorescent or radioactive signal intensity at the position of the complementary probe. However, the substantial amounts of labeled probe required, for example to hybridize to microarrays, make it difficult to test small tissue samples and groups of isolated cells in such methods.
While amplification methods have been previously described, such methods suffer from a skewing of representation, where the end product does not reflect the distribution of species in the starting population. Prior art methods include T7 transcription of cDNA, as described by Van Gelder et al., U.S. Pat. No. 6,291,170; tissue culture of cells to increase the absolute quantity of RNA harvested; single primer isothermal amplification (Walker et al. (1992) P.N.A.S. 89(1):392-6); SMART cDNA amplification (Seth et al. (2003) J Biochem Biophys Methods. 55(1):53-66); and single primer amplification (SPA) over a 40 cycle asymmetric amplification (Smith et al. (2003) Nucleic Acids Res. 31(3):e9).
Lisitsyn et al. (1993) Science 259:946-951 introduced a PCR-based method referred to as representational difference analysis (RDA). RDA utilizes PCR to enrich for unique species in one of the samples after hybridization and polymerization steps. RDA uses two separate ligations of two different adaptors to enrich for unique species. After an initial PCR amplification of both tester and driver samples with a first adaptor, a second adaptor is attached to the ends of the tester DNA but not the driver DNA. Then, after mixing the second adaptor-treated tester DNA with driver DNA, denaturing, hybridizing, and filling in overhanging ends, only double stranded tester DNA should amplify exponentially with PCR primers specific for the second adaptor sequences. The tester:driver hybrids should amplify linearly and the driver:driver hybrids should not amplify at all.
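The three amplification outcomes described above can be illustrated with a minimal sketch. It assumes an idealized PCR in which every cycle is perfectly efficient; the function name `amplify_hybrids` and its parameters are hypothetical and serve only to illustrate the kinetics, not to describe any published protocol.

```python
def amplify_hybrids(tester_tester, tester_driver, driver_driver, cycles):
    """Idealized copy numbers after PCR with second-adaptor primers.

    Assumption: 100% efficiency per cycle, no reagent depletion.
    """
    # Both strands carry the second adaptor: the product doubles each cycle.
    tt = tester_tester * 2 ** cycles
    # Only the tester strand carries the adaptor: one new copy per template
    # per cycle, so growth is linear.
    td = tester_driver * (1 + cycles)
    # Neither strand carries the adaptor: no primer site, no amplification.
    dd = driver_driver
    return tt, td, dd


# One starting molecule of each hybrid type, 20 cycles.
tt, td, dd = amplify_hybrids(1, 1, 1, 20)
print(tt, td, dd)
```

Under these assumptions a single tester:tester duplex yields about a million copies after 20 cycles, while a tester:driver hybrid yields 21 and a driver:driver hybrid remains at 1, which is why only tester-specific species dominate the final product.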
In order to be effective for cloning, RDA requires a reduced complexity in the starting material used. To reduce the complexity, RDA generally employs digestion of total genomic DNA with a six base pair-cutting enzyme, followed by amplification of the digested DNA by PCR. A high proportion of the digested fragments do not fall within the amplifiable range of 150-1000 base pairs. Larger fragments are not amplified, reducing the complexity of the amplicon so that the final representation contains only about 2-10% of the total genome. The PCR representation therefore does not encompass the entire sequence information available in the genome. Consequently, desired sequences may not be represented in the subtracted library, while undesired species may be represented in it.
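The scale of this complexity reduction can be checked with a rough calculation. The sketch below assumes a random genome with equal base frequencies, so that a six-base recognition site occurs on average once every 4^6 = 4096 base pairs and fragment lengths are approximately exponentially distributed. These are simplifying assumptions, not properties of any particular genome or enzyme, and the function name is hypothetical.

```python
import math


def genome_fraction_in_window(site_length=6, min_bp=150, max_bp=1000):
    """Fraction of genome bases lying in fragments of min_bp..max_bp.

    Assumption: random sequence, equal base frequencies, exponential
    fragment-length distribution with mean 4**site_length.
    """
    mean_fragment = 4 ** site_length  # 4096 bp for a six-cutter

    def bases_in_longer_fragments(x):
        # Length-weighted tail of the exponential distribution:
        # fraction of bases found in fragments longer than x.
        r = x / mean_fragment
        return (1 + r) * math.exp(-r)

    return bases_in_longer_fragments(min_bp) - bases_in_longer_fragments(max_bp)


fraction = genome_fraction_in_window()
print(f"{fraction:.1%}")
```

Under this simple model only a few percent of the genome falls in amplifiable fragments, which is consistent with the lower end of the 2-10% figure given above. Real genomes deviate from the random model, so the actual value depends on the enzyme and the organism.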
RDA has been applied to cDNA subtraction by Hubank and Schatz (1994) N.A.R. 22:5640-5648. The method is very similar to the RDA described by Lisitsyn et al., with cDNA being used as the starting material instead of genomic DNA. As with genomic RDA, there are two adaptor ligation steps. The method is designed so that only tester:tester hybrids contain the PCR primer binding sites on both ends of the strands of DNA, and thus are the only species that are exponentially amplified. In contrast to the complexity of genomic DNA, a population of cDNA derives from some 15,000 different genes in a typical cell and represents only about 1-2% of the total genome. Therefore, RDA can apparently be applied to cDNA without the need to first reduce the complexity.
The hybridization goes to completion, which allows the selection of rare sequences in the tester population. The more abundant driver population competes out the tester population through hybridization, which results in non-exponential products. Consequently, the more abundant nucleic acids in the tester population have a higher probability of subsequent exponential amplification than the rare nucleic acids. The linearly amplified rare nucleic acids can effectively become lost from the amplified population.
There remains a need in the art for new and improved methods to amplify populations of nucleic acids, particularly mRNA populations, where there is a significant increase in the amount of testable material, with a direct linear relationship to the starting population. By providing methods for non-preferentially replicating or amplifying nucleic acids, the disclosed invention fulfills that need. The ability of this divergent population of molecules to be amplified efficiently provides a means of generating probes and libraries of expression signatures, and consequently insights into the biology of living systems.