This invention relates to methods for amplifying DNA sequences, including those selected in genome mismatch scanning procedures, through the use of rolling circle DNA amplification. Methods of the invention are useful in genotyping, phase determination, polymorphism analyses, mismatch scanning procedures, and general cloning procedures.
Rolling circle amplification (RCA) is an isothermal process for generating multiple copies of a sequence. In rolling circle DNA replication in vivo, a DNA polymerase extends a primer on a circular template (Kornberg, A. and Baker, T. A. DNA Replication, W. H. Freeman, New York, 1991). The product consists of tandemly linked copies of the complementary sequence of the template. RCA has been adapted for use in vitro for DNA amplification (Fire, A. and Si-Qun Xu, Proc. Natl. Acad. Sci. USA, 1995, 92:4641-4645; Lui, D., et al., J. Am. Chem. Soc., 1996, 118:1587-1594; Lizardi, P. M., et al., Nature Genetics, 1998, 19:225-232; U.S. Pat. No. 5,714,320 to Kool). RCA can also be used in a detection method using a probe called a “padlock probe” (WO Pat. Ap. Pub. 95/22623 to Landegren; Nilsson, M., et al. Nature Genetics, 1997, 16:252-255, and Nilsson, M., and Landegren, U., in Landegren, U., ed., Laboratory Protocols for Mutation Detection, Oxford University Press, Oxford, 1996, pp. 135-138). DNA synthesis in these methods has been limited to rates ranging between 50 and 300 nucleotides per second (Lizardi, cited above and Lee, J., et al., Molecular Cell, 1998, 1:1001-1010).
In some embodiments of this invention, increased rates of DNA synthesis in RCA are achieved by the use of DNA polymerase III holoenzyme (also referred to herein as pol III), which has an intrinsic catalytic rate of about 700-800 nucleotides per second (Kornberg and Baker, cited above). The invention also applies to subassemblies of the pol III holoenzyme which lack one or more of the subunits found in the complete, native enzyme complex (Kornberg and Baker, cited above). The invention applies to DNA polymerase III holoenzyme derived from E. coli and also from other bacteria, including gram-positive and gram-negative bacteria, or related DNA polymerases from eukaryotes that have clamp (PCNA) and clamp loader (RFC) components (Kornberg and Baker, cited above). These pol III-like DNA polymerases are evolutionarily distinguished from the pol I-type enzymes (Braithwaite, D. K., and Ito, J., Nuc. Acids Res., 1993, 21:787-802) that have previously been employed in RCA (Fire and Xu, Lui, D. et al., Lizardi et al., and Lee et al., all cited above).
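The practical effect of the rate difference can be illustrated with simple arithmetic. The following Python sketch is an idealized illustration only: it assumes uninterrupted synthesis at a constant rate, and the 5,000-nucleotide circle and one-hour reaction time are hypothetical values chosen for the example, not figures from this disclosure. Only the rates (50, 300, and about 750 nucleotides per second) are taken from the text above.

```python
def copies_made(duration_s, circle_nt, rate_nt_per_s):
    """Tandem copies of a circle produced by one polymerase.

    Idealized: assumes constant-rate synthesis with no initiation
    delay, pausing, or dissociation.
    """
    return duration_s * rate_nt_per_s / circle_nt


# Hypothetical example values (not from the source text).
CIRCLE_NT = 5000      # circle length in nucleotides
DURATION_S = 3600     # one hour

for label, rate in [("pol I-type, low", 50),
                    ("pol I-type, high", 300),
                    ("pol III, approx.", 750)]:
    n = copies_made(DURATION_S, CIRCLE_NT, rate)
    print(f"{label:18s} {rate:4d} nt/s -> {n:.0f} copies per hour")
```

Under these assumptions the copy number scales linearly with the synthesis rate, so the benefit of a faster polymerase is proportional to the ratio of the rates.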
Therefore, this invention introduces the novel use of a distinct class of DNA polymerases that have not previously been used in RCA. The methods are applicable to polymorphism detection, diagnostics, phase determination, genotyping, genomic mapping, DNA sequencing, synthesis of DNA probes, or cloning. The high rate of synthesis, great processivity, and ability to replicate through sequence obstructions give pol III an advantage over other DNA polymerases in RCA. The E. coli dnaB, dnaG, and dnaC proteins or other helicases and the single-stranded DNA binding protein (SSB) can also be used to facilitate the reaction (Kornberg and Baker, cited above). This invention applies to the use of pol III with any accessory proteins, including helicases, primases, and DNA binding proteins, that facilitate the pol III reaction.
In another embodiment of the invention, two or more DNA polymerases are combined in one RCA reaction. One of the polymerases may have a 3′→5′ exonuclease activity capable of removing mismatched nucleotides. Such combinations of DNA polymerases are known to increase primer extension (Cheng, S. et al., Proc. Natl. Acad. Sci. USA, 1994, 91:5695-5699).
This invention further provides for a method to produce approximately equimolar rolling circle amplification of DNA fragment mixtures. The method is applicable to RCA of any DNA, including for purposes of detection, cloning, generation of probes, genetic mismatch scanning (GMS) procedures, DNA mapping, sequencing, and genotyping. In an RCA using mixed circular DNA templates of different length, a greater number of copies of shorter circles will be generated relative to longer circles. This effect is reduced by creating a “slow step” or “pause site” that occurs once each time the DNA polymerase copies around the circle. The DNA polymerase therefore copies rapidly around the circle and then pauses for the slow step before copying around the circle again. The number of copies made of each circle will tend to be the same, independent of the length of the circle. In one procedure, the pause site is created by the introduction of one or more abasic sites in the template. DNA polymerases are slowed but not completely blocked by such a site. They will tend to insert a nucleotide opposite the abasic site (Randell, S. K., et al., J. Biol. Chem., 1987, 262:6864-6870).
In one embodiment of this invention, DNA fragments selected with genomic mismatch scanning are amplified by RCA. In 1993 Nelson and associates described and employed GMS to directly identify identical-by-descent (IBD) sequences in yeast (Nelson, S. F., et al., Nature Genetics, 1993, 4:11-18). The method allows DNA fragments from IBD regions between two relatives to be isolated based on their ability to form mismatch-free hybrid molecules. The method begins by digesting DNA fragments from two sources with a restriction endonuclease that produces protruding 3′-ends. The protruding 3′-ends provide some protection from exonuclease III (Exo III), which is used in later steps. The two sources are distinguished by methylating the DNA from only one source. Molecules from both sources are denatured and reannealed, resulting in the formation of four types of duplex molecules: homohybrids formed from strands derived from the same source and heterohybrids consisting of DNA strands from different sources. Heterohybrids can either be mismatch-free or contain base-pair mismatches, depending on the extent of identity of homologous regions.
Homohybrids are distinguished from heterohybrids by use of restriction endonucleases that cleave fully methylated or unmethylated GATC sites. Homohybrids are cleaved into smaller duplex molecules. Heterohybrids containing a mismatch are distinguished from mismatch-free molecules by use of the E. coli methyl-directed mismatch repair system. The combination of three proteins of the system, MutS, MutL, and MutH (herein collectively called MutSLH), along with ATP, introduces a single-strand nick on the unmethylated strand at GATC sites in duplexes that contain a mismatch (Welsh, et al., J. Biol. Chem., 1987, 262:15624). Heterohybrids that do not contain a mismatch are not nicked. All molecules are then subjected to digestion by Exo III, which can initiate digestion at a nick, a blunt end, or a recessed 3′-end, to produce single-stranded gaps. Only mismatch-free heterohybrids are not subject to attack by Exo III; all other molecules have single-stranded gaps introduced by the enzyme. Molecules with single-stranded regions are removed by absorption to benzoylated naphthoylated DEAE cellulose. The remaining molecules consist of mismatch-free heterohybrids which may represent regions of IBD.
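The selection logic of the GMS procedure described above can be summarized as a simple decision rule. The following Python sketch is illustrative only and is not part of the disclosed methods: the `Duplex` record and `survives_gms` function are hypothetical names, and the sketch assumes that every fragment contains at least one GATC site and that each enzymatic step goes to completion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Duplex:
    top_source: str      # individual the top strand came from, e.g. "A" (methylated)
    bottom_source: str   # individual the bottom strand came from, e.g. "B" (unmethylated)
    has_mismatch: bool   # True if the duplex contains a base-pair mismatch


def survives_gms(duplex: Duplex) -> bool:
    """Return True if the duplex is retained after Exo III digestion
    and removal of molecules with single-stranded regions."""
    # Homohybrids are fully methylated or fully unmethylated: they are
    # cleaved at GATC sites and the resulting ends are open to Exo III.
    if duplex.top_source == duplex.bottom_source:
        return False
    # Mismatch-containing heterohybrids are nicked by MutSLH on the
    # unmethylated strand; Exo III then starts digestion at the nick.
    if duplex.has_mismatch:
        return False
    # Mismatch-free heterohybrids are neither cleaved nor nicked.
    return True


pool = [
    Duplex("A", "A", False),   # homohybrid
    Duplex("B", "B", False),   # homohybrid
    Duplex("A", "B", True),    # heterohybrid with mismatch
    Duplex("A", "B", False),   # mismatch-free heterohybrid
]
retained = [d for d in pool if survives_gms(d)]
```

In this toy pool only the last duplex, the mismatch-free heterohybrid, is retained, corresponding to a candidate IBD fragment.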
Methods are given for isolating DNA containing nucleotide base mispairs using a modified rolling circle amplification procedure. DNA fragments containing the base mismatches are nicked by conventional genomic mismatch scanning methods. The 3′-OH at the nick serves as a primer for DNA synthesis. The 3′-end is elongated by a DNA polymerase possessing strand displacement or nick translation capacity, or by a combination of a DNA polymerase capable of strand displacing at a nick and DNA polymerase III holoenzyme, which provides a high rate of processive DNA synthesis. Specific Y-shaped adapters attached to the ends of the fragments are designed such that DNA products generated by the extension of the 3′-OH at the nick have a unique sequence at their 3′-end. The unique sequences allow for the selective circularization of these fragments with a complementary splint oligonucleotide. Rolling circle amplification is then carried out with a DNA polymerase. DNA polymerase III holoenzyme (herein referred to as pol III or pol III holoenzyme) is used to provide a superior rate of DNA synthesis and also high processivity, which allows rapid replication through regions of high GC content, hairpin structures and other regions of secondary structure, and regions that normally slow replication due to local sequence context effects. The E. coli dnaB, dnaG, and dnaC gene products or other DNA helicases and the single-stranded DNA binding protein (SSB) are also used to improve the reaction. The use of pol III or pol III combined with other replication proteins is generally applicable to any RCA procedure in addition to methods specifically relating to GMS procedures. In another method that improves any RCA reaction in general, two DNA polymerases are combined. One of the polymerases has a 3′→5′ exonuclease activity capable of removing misincorporated nucleotides.
In addition, methods are given for carrying out rolling circle amplification of a mixture of DNA circles having different lengths. In general, more copies will tend to be made for shorter circles because the DNA polymerase requires less time to replicate them. For some procedures, including the amplification of DNA for cloning or detection purposes, or for genomic mismatch scanning, it is desirable to produce approximately equal numbers of all circles independent of their length. This is accomplished by creating a slow step in the replication process. Therefore, replication stops for a period of time once each time the DNA polymerase copies around the circle. The result of having one slow step for each copy of the circle that is synthesized is that the rate-limiting step for the amplification tends to be the same regardless of the length of the circle. This tends to minimize the disparity between the number of copies made for circles of different length. The “slow step” is created by introducing a site on the DNA sequence where the DNA polymerase is slowed or otherwise partially obstructed. A slowing of the rate of DNA polymerization is typically created at so-called “pause sites” at naturally occurring sequences where the local DNA structure is unfavorable for replication, or by introducing abasic sites which require longer times for the insertion of nucleotides by the DNA polymerase. Several types of potential pause sites are described herein. An alternative approach is to completely block the DNA polymerase with a reversible obstruction so that replication can be repeatedly stopped and then continued. For example, a properly designed hairpin structure can block replication at a low temperature while elevation to a higher temperature can be repeatedly used to allow the next cycle of DNA synthesis.
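The equalizing effect of a slow step can be seen in a simple kinetic model in which the time for one trip around a circle is the copying time (length divided by rate) plus one fixed pause. The following Python sketch is a minimal illustration under stated assumptions: the polymerase rate of 750 nucleotides per second, the 60-second pause, and the 500- and 5,000-nucleotide circle lengths are hypothetical values chosen for the example, and the function names are not from this disclosure.

```python
def copies_per_circle(length_nt, rate_nt_per_s, pause_s, duration_s):
    """Copies of one circle made in duration_s, with one pause per round.

    Idealized: constant synthesis rate and a fixed pause duration.
    """
    time_per_round = length_nt / rate_nt_per_s + pause_s
    return duration_s / time_per_round


def copy_ratio(short_nt, long_nt, rate_nt_per_s, pause_s, duration_s=3600.0):
    """Ratio of copies made of the short circle to copies of the long circle."""
    short = copies_per_circle(short_nt, rate_nt_per_s, pause_s, duration_s)
    long_ = copies_per_circle(long_nt, rate_nt_per_s, pause_s, duration_s)
    return short / long_


# Hypothetical values: 500-nt and 5,000-nt circles, 750 nt/s polymerase.
no_pause = copy_ratio(500, 5000, 750, pause_s=0.0)
with_pause = copy_ratio(500, 5000, 750, pause_s=60.0)
print(f"without pause: short/long copy ratio = {no_pause:.2f}")
print(f"with 60 s pause: short/long copy ratio = {with_pause:.2f}")
```

Without a pause the ratio equals the ratio of the circle lengths (here tenfold). When the pause is long relative to the copying time, the pause dominates the time per round and the ratio approaches one, at the cost of a lower overall yield.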
Methods are also given for determining the genetic phase of linked DNA markers by selective amplification of one parental haplotype. Several procedures are given for cutting DNA to create the target fragment to be analyzed, and circularizing the target DNA. Alternative procedures are also used to prime the DNA synthesis used for RCA. By circularizing the target fragment with an adapter for which only one of its strands can be ligated, a nick with a 3′-OH is created in the DNA circle that can serve as a primer for initiating rolling circle amplification. By using an adapter which has an internal single-stranded region and which also has double-stranded ends with appropriate overhangs for ligation to the target DNA, a single-stranded gap is introduced into the circularized adapter-fragment construct. This gap can be employed as a site for primer annealing facilitating the initiation of rolling circle amplification. The 3′-OH of the gap itself can also serve as a primer.
The single-stranded DNA product of rolling circle amplification can itself be replicated by annealing of complementary primers which can be extended in conventional primer elongation reactions or in hyperbranching reactions in which exponential amplification occurs (Lizardi, cited above). By choosing primers with 3′-ends complementary to one of two alleles, the DNA synthesis can be used for detection purposes. DNA polymerase III or DNA pol II derived from E. coli or other bacteria, or analogous polymerase complexes from eukaryotic organisms that also have clamp and clamp loader components (Kelman, Z., and O'Donnell, M., Annu. Rev. Biochem., 1995, 64:171-200, Bloom, L. B., et al., J. Biol. Chem., 1997, 272:27919-27930, and Kelman, Z., et al., Structure, 1998, 6:121-125) are used to facilitate amplification of DNA targets including large fragments that are difficult to replicate with other enzymes. DNA pol III is used to provide a superior rate of DNA synthesis and also high processivity, which allows rapid replication through regions of high GC content, hairpin structures and other regions of secondary structure, and regions that normally slow replication due to local sequence context effects. DNA helicases such as the dnaB gene product and SSB of E. coli can be used to further improve rate and strand displacement. The superior performance of DNA pol III relative to other DNA polymerases gives an advantage in any genotyping, DNA mapping, DNA sequencing, or cloning work in which large DNA fragments, 1 kb to greater than a megabase in length, are used, and also for shorter fragments where rate or strand displacement is important.
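The allele-specific priming described above relies on the 3′-terminal base of the primer pairing with the template at the polymorphic position. The following Python sketch illustrates that check in its simplest form. It is a hedged illustration, not the disclosed procedure: the sequences are invented for the example, the function names are hypothetical, and the model treats a 3′-terminal mismatch as a complete block to extension, which is a simplification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_end_pairs(primer: str, target_site: str) -> bool:
    """True if the primer's 3'-terminal base pairs with the template.

    target_site is the template segment (5'->3') the primer anneals to
    and has the same length as the primer, so the primer's 3' base sits
    opposite the first (5'-most) base of target_site.
    """
    return primer[-1] == reverse_complement(target_site)[-1]


# Invented example: the two alleles differ at the 5'-most base of the site.
site_allele_1 = "GACCTGA"
site_allele_2 = "AACCTGA"

primer_1 = reverse_complement(site_allele_1)   # "TCAGGTC"
primer_2 = reverse_complement(site_allele_2)   # "TCAGGTT"

for name, primer in [("primer_1", primer_1), ("primer_2", primer_2)]:
    for allele, site in [("allele 1", site_allele_1), ("allele 2", site_allele_2)]:
        ok = three_prime_end_pairs(primer, site)
        print(f"{name} on {allele}: {'extends' if ok else 'blocked'}")
```

In this model each primer is extended only on its own allele, so the presence or absence of product reports which allele is present in the amplified DNA.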
In addition, methods are given for converting DNA fragments into a form that can be utilized as RCA templates by ligation of hairpin-forming adapters to the ends of the fragments. The adapters have 3′ and 5′ ends that are complementary to each other such that they form stem and loop structures. Furthermore, the stem portion of the hairpin structure creates blunt or overhanging ends that allow the adapter to be ligated to the end of any DNA fragment having the appropriate end. By ligating such adapters to both ends of the DNA fragments, the fragments are converted to a circular form which can be utilized as the template for an RCA reaction. Also, the loop portion of the adapter provides a single-stranded region to which the RCA primer can be annealed.
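The stem and loop arrangement of such an adapter can be checked computationally by counting how many bases at the 5′ end pair with the 3′ end. The following Python sketch is illustrative only: the adapter sequence is invented for the example and forms a blunt-ended stem, the function name is hypothetical, and only Watson-Crick pairs are counted. The short loop is chosen for readability and is not meant to suggest a practical primer-binding length.

```python
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def stem_length(adapter: str) -> int:
    """Number of consecutive Watson-Crick pairs formed between the
    5' end and the 3' end of a single-stranded adapter (5'->3')."""
    n = 0
    for i in range(len(adapter) // 2):
        if (adapter[i], adapter[-1 - i]) in PAIRS:
            n += 1
        else:
            break
    return n


# Invented example: 7-nt arms flanking a 5-nt single-stranded loop.
adapter = "GCGATCC" + "TTTTT" + "GGATCGC"

stem = stem_length(adapter)
loop = len(adapter) - 2 * stem
print(f"stem: {stem} bp, loop: {loop} nt")
```

For this sequence the sketch reports a 7-base-pair stem and a 5-nucleotide loop; the loop is the single-stranded region to which an RCA primer would anneal.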
Another invention uses two or more DNA polymerases in an RCA reaction. At least one of the DNA polymerases possesses a 3′→5′ exonuclease proofreading activity capable of correcting base mispairs. The removal of misincorporated bases allows for greater primer extension.