Since the discovery that genes, the hereditary material, are made up of nucleic acids (McCarthy, M., Nature 421 (2003) 406) and that genetic alterations are a molecular basis of disease (Guttmacher, A. E. and Collins, F. S., N. Engl. J. Med. 347 (2002) 1512-1520) and evolution (Ayala, F. J., Proc. Natl. Acad. Sci. USA 104, Suppl. 1 (2007) 8567-8573), nucleic acids have become prominent target molecules of investigation. The most powerful and versatile methods for the investigation of nucleic acids on the genomic scale are microarrays (Brown, P. O. and Botstein, D., Nat. Genet. 21 (1999) 33-37) and high-throughput sequencers of the second or third generation (Shendure, J. and Ji, H., Nat. Biotechnol. 26 (2008) 1135-1145). These techniques usually require microgram amounts of nucleic acids for analysis, which corresponds to hundreds of thousands of mammalian cells (Peano, C. et al., Expert Rev. Mol. Diagn. 6 (2006) 465-480; Tang, F. et al., Nat. Methods 6 (2009) 377-382).
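The relationship between nucleic acid mass and cell number stated above can be checked with simple arithmetic. The following Python sketch is illustrative only: the function name `cells_required` is hypothetical, and the value of about 6 pg of genomic DNA per diploid mammalian cell is an assumed round figure that is not taken from the cited references.

```python
import math

# Assumption (not from the cited references): a diploid mammalian cell
# contains roughly 6 pg of genomic DNA. Adjust for other nucleic acids.
ASSUMED_PG_DNA_PER_CELL = 6.0


def cells_required(mass_ng: float, pg_per_cell: float = ASSUMED_PG_DNA_PER_CELL) -> int:
    """Return the number of cells needed to obtain `mass_ng` nanograms of
    nucleic acid, assuming lossless extraction and `pg_per_cell` picograms
    of nucleic acid per cell."""
    if mass_ng < 0 or pg_per_cell <= 0:
        raise ValueError("mass must be non-negative and pg_per_cell positive")
    mass_pg = mass_ng * 1000.0  # 1 ng = 1000 pg
    return math.ceil(mass_pg / pg_per_cell)


if __name__ == "__main__":
    # One microgram (1000 ng) versus a nanogram-range sample (10 ng).
    for mass_ng in (1000.0, 10.0):
        print(f"{mass_ng:g} ng -> about {cells_required(mass_ng)} cells")
```

Under these assumptions, one microgram of genomic DNA corresponds to roughly 170 000 cells, which is in line with the order of magnitude given above, whereas a single cell would provide only a few picograms.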
However, under many important conditions, it is practically impossible to obtain such large amounts of material. For example, techniques used to isolate human tissues, such as biopsy, fine-needle aspiration, cytolavage and laser capture microdissection, often achieve yields of extracted nucleic acids in the nanogram range (Kamme, F. et al., Methods Mol. Med. 99 (2004) 215-223). Other examples come from developmental studies and from embryo cell, neuron, immune cell, cancer cell and stem cell research (Saitou, M. et al., Nature 418 (2002) 293-300; Chambers, I. et al., Nature 450 (2007) 1230-1234; Toyooka, Y. et al., Development 135 (2008) 909-918; Kamme, F. et al., J. Neurosci. 23 (2003) 3607-3615; Stoecklein, N. H. et al., Cancer Cell 13 (2008) 441-453; Diercks, A. et al., PLoS One 4 (2009) e6326). In fact, during early mouse development, when the primordial germ cells, the founder population of the germline, have just emerged, there are only around 30 primordial germ cells in the embryo (Saitou, M. et al., Nature 418 (2002) 293-300). Even for in vitro-cultured stem cells, for which the number of cells would appear to be unlimited, there are serious limitations due to stem cell heterogeneity. For example, mouse embryonic stem cells, probably the most thoroughly analyzed type of stem cells, contain multiple subpopulations with strong differences in both gene expression and physiological function, which in turn creates a need for genomic analysis at the level of subpopulations or even single cells (Chambers, I. et al., Nature 450 (2007) 1230-1234; Toyooka, Y. et al., Development 135 (2008) 909-918).
Therefore, in order to overcome the limitations of array and high-throughput sequencing technologies and to permit multiple analyses of even a single cell, methods are needed that amplify small amounts of nucleic acid without significantly distorting the information content of the sample. In this respect, many protocols for nucleic acid amplification of the whole genome as well as of the whole transcriptome have been developed in the last 20 years (Peano, C. et al., Expert Rev. Mol. Diagn. 6 (2006) 465-480; Lasken, R. S. and Egholm, M., Trends Biotechnol. 21 (2003) 531-535). Most of these methods are based on in vitro transcription, on isothermal amplification or on PCR (polymerase chain reaction).
The in vitro transcription method developed by Van Gelder and Eberwine (Van Gelder, R. N. et al., Proc. Natl. Acad. Sci. USA 87 (1990) 1663-1667) enables the linear amplification of RNA. The original method and its technical revisions are based on double-stranded cDNA synthesis followed by RNA synthesis. The error rate of in vitro transcription is relatively low, not because of the error rate of RNA polymerases (one mismatch for every 10 000 bases of synthesis), but because the input double-stranded DNA templates are the only source of template for the complete amplification and, therefore, any errors created in the newly synthesized RNA will not be carried over or amplified in the following reactions (Wang, E., J. Transl. Med. 3 (2005) 1-11). In vitro transcription, however, is burdensome and time-consuming, is restricted to RNA samples and generates less stable RNA amplification products. Furthermore, the method is prone to a 3′ bias, which is introduced by the use of a promoter-modified oligo(dT) primer and is especially pronounced when two rounds of amplification are employed, because the second-round RNA molecules will be shorter, leading to a loss of information at the 5′ end of the transcript (Peano, C. et al., Expert Rev. Mol. Diagn. 6 (2006) 465-480; Wang, E., J. Transl. Med. 3 (2005) 1-11).
Most of the isothermal amplification methods are based upon the strand-displacement amplification approach, which relies on DNA polymerases with strong strand-displacement activity, such as exo-Klenow, Bca, Bst or phi29 DNA polymerases (Dean, F. B. et al., Proc. Natl. Acad. Sci. USA 99 (2002) 5261-5266; Walker, G. T. et al., Proc. Natl. Acad. Sci. USA 89 (1992) 392-396; Kurn, N. et al., Clin. Chem. 51 (2005) 1973-1981). Priming sites for these polymerases are provided by nick-generating restriction enzymes or by random oligonucleotide primers. The unique properties of this reaction allow repeated DNA synthesis over the same template at 30° C., with each new copy displacing previously made copies. Therefore, sophisticated instrumentation, such as a thermocycler, is not necessary. Furthermore, the phi29 DNA polymerase in particular exhibits a robust ability to replicate through difficult sequences as well as an extensive processivity of 10-100 kb at relatively low error rates (1 error every 10⁶-10⁷ bases) (Dean, F. B. et al., Proc. Natl. Acad. Sci. USA 99 (2002) 5261-5266; Esteban, J. A. et al., J. Biol. Chem. 268 (1993) 2719-2726). However, the previously described isothermal amplification methods have drawbacks. Strand-displacement amplification methods such as that of Walker, G. T. et al. (Proc. Natl. Acad. Sci. USA 89 (1992) 392-396) require the presence of sites for defined restriction enzymes, which limits their applicability. For randomly primed strand-displacement amplification methods such as those of Dean et al. or Kurn et al. (Dean, F. B. et al., Proc. Natl. Acad. Sci. USA 99 (2002) 5261-5266; Kurn, N. et al., Clin. Chem. 51 (2005) 1973-1981), it is questionable whether they yield unbiased products that are an accurate and even replication of the original sequence.
PCR-mediated exponential amplification, developed by Mullis (Mullis K. et al., Cold Spring Harb. Symp. Quant. Biol. 51 Pt. 1 (1986) 263-273), offers many advantages, such as high amplification yields that suggest the possibility of greatly reducing the amount of input material, together with fast and easy protocols that can drastically reduce the costs of analyses, thus enabling more complex experimental designs. Moreover, double-stranded PCR products are particularly stable. In addition to conventional PCR amplification techniques, methods for performing PCR in emulsion droplets are known in the art (EP 1 482 036; Williams, R. et al., Nat. Methods 3 (2006) 545-550).
However, the PCR technology suffers from several drawbacks. First, PCR amplifies small regions of a few hundred nucleotides most efficiently, whereas the level of amplification decreases when larger regions are targeted. As a result, shorter fragments tend to be amplified in preference to larger ones. Second, amplification of genomic libraries, cDNA libraries and other complex mixtures of genes by PCR suffers from artifactual fragments that are generated by recombination between homologous regions of DNA. Recombination in this case occurs when a primer is partially extended on one template during one cycle of PCR and further extended on another template during a later cycle. Thus, chimeric molecules are generated, the short ones of which are then preferentially amplified (Williams, R. et al., Nat. Methods 3 (2006) 545-550; Meyerhans, A. et al., Nucleic Acids Res. 18 (1990) 1687-1691). Third, additional problems in the quality of the amplified nucleic acid sequences originate from the use of Thermus aquaticus (Taq) DNA polymerase, which is characterized by a relatively low fidelity. The Taq polymerase error rate (at best, one mismatch for every 50 000 bases of synthesis) results in the incorporation of several erroneous bases in most of the PCR-amplified DNAs (Lundberg, K. S. et al., Gene 108 (1991) 1-6). These misincorporations are propagated through subsequent cycles of the amplification. Fourth, the proportionality of the amplification process is lost. The exponential PCR reaction reaches saturation when excess input template quantities are used, thus favoring the amplification of highly abundant over low-abundance transcripts. Furthermore, the DNA polymerase has a low efficiency in the amplification of GC-rich sequences as opposed to AT-rich sequences (Wang, E., J. Transl. Med. 3 (2005) 28). The different amplification efficiencies can potentially result in a several thousand-fold differential representation of DNAs in the DNA population after as few as 30 cycles of amplification.
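The magnitude of this effect can be illustrated with a simple model. The following Python sketch assumes, purely for illustration, that each template is amplified with a constant per-cycle efficiency E between 0 and 1, so that its copy number grows by a factor of (1 + E) per cycle, and that saturation is ignored. The function name `fold_bias` and the example efficiencies are hypothetical and are not taken from the cited references.

```python
def fold_bias(eff_a: float, eff_b: float, cycles: int) -> float:
    """Return the ratio of copies of template A to copies of template B
    after `cycles` PCR cycles, starting from equal amounts.

    Simplifying assumption: each template has a constant per-cycle
    efficiency (0 = no amplification, 1 = perfect doubling) and the
    reaction never reaches saturation.
    """
    if not (0.0 <= eff_a <= 1.0 and 0.0 <= eff_b <= 1.0):
        raise ValueError("efficiencies must lie between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    # Copy number after n cycles is N0 * (1 + E) ** n, so N0 cancels out.
    return ((1.0 + eff_a) / (1.0 + eff_b)) ** cycles


if __name__ == "__main__":
    # Hypothetical example: an efficiently amplified template (E = 1.0)
    # versus a poorly amplified, e.g. GC-rich, template (E = 0.5).
    print(f"fold difference after 30 cycles: {fold_bias(1.0, 0.5, 30):.0f}")
```

Under these assumptions, a template amplified with an efficiency of 1.0 is over-represented more than 5000-fold after 30 cycles relative to a template amplified with an efficiency of 0.5, although both were present in equal amounts at the start.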
In summary, the general properties and disadvantages of the current protocols for nucleic acid amplification show that there is a need for improved nucleic acid amplification methods. In particular, there is a requirement for unbiased pre-amplification when material from only a single cell or only a few cells is available. The present invention fulfills this need, overcomes several drawbacks and provides additional benefits.