PCR is a powerful method by which specific sequences of DNA can be amplified exponentially in vitro without cloning. PCR requires that primer annealing sites be present at each end of the target sequence in order for amplification to occur. Until recently, PCR has required knowledge of the sequences flanking the target sequence on both sides. Consequently, PCR could not be used to amplify and sequence unknown DNA flanking a known sequence. Recently, various methods have been developed that permit the amplification of sequences which flank only one known primer annealing site. These methods permit the amplification of previously uncharacterized regions of DNA. However, none of these existing methods has been shown to have sufficient specificity for widespread application for the direct sequencing of unknown DNA in human genomic DNA.
These methods that have been developed to amplify unknown flanking DNA can be divided into the following six types. The first method involves the creation of new primer annealing sites by tailing all the strands in a mixture using non-template directed DNA polymerization with one deoxyribonucleotide and terminal transferase (Frohman et al., Proc. Natl. Acad. Sci. USA, 85, 8998-9002, 1988; Loh et al., Science, 243, 217-220, 1989; Ohara et al., Proc. Natl. Acad. Sci. USA, 86, 5673-5677, 1989). Due to this tailing reaction, all of the strands in the mixture contain a universal primer annealing site. This site is used in conjunction with a sequence-specific primer for the PCR amplification of unknown DNA flanking a known sequence.
The second method requires ligation of linkers or plasmid sequences to the ends of restriction enzyme digested DNA strands in the original mixture (Shyamala et al., Gene, 84, 1-8, 1990; Hovens et al., Nucleic Acids Res., 17, 4415, 1989). In both the first and second methods, the specificity of the reaction is diminished by the creation of a new primer binding site which is shared by the other sequences in the original mixture, thereby increasing the probability of unwanted products. It is not surprising, therefore, that these strategies have found little success in the specific amplification of genomic eukaryotic DNA.
In a modification of the second method, the third method adds a physical separation step in order to increase the specificity of the reaction. Ligation of linkers to the ends of restriction enzyme digested DNA is followed by primer extension with a biotinylated primer that anneals to the unique site. This is followed by separation with magnetic streptavidin (Rosenthal et al., Nucleic Acids Res., 18, 3095-3096, 1990). The third method has been used for the amplification of unknown DNA from genomic eukaryotic DNA, but the length of DNA amplified has not been shown, and the overall effectiveness of this method is not clear.
The fourth method involves polymerization from a single primer site followed by ligation of a linker (subsequently used as a primer annealing site) to the opposite end of the double-stranded primer-extended product (Pfeifer et al., Science, 246, 810-813, 1989; Mueller and Wold, Science, 246, 780-786, 1989; Steigerwald et al., Nucleic Acids Res., 18, 1435-1439, 1990). This method has been used to sequence a single product from human genomic DNA, but not without the complication of using the multiplex method of sequence analysis (Church and Kieffer-Higgins, Science, 240, 185-188, 1988). In addition, the fourth method may be limited by the efficiency of primer extension to a blunt ended product, the efficiency of blunt-end ligation, and by ligation of the linker (primer annealing site) to other double-stranded ends.
The fifth method is called vectorette PCR (Riley et al., Nucleic Acids Res., 18, 2887-2890, 1990). In this method, synthetic duplexes are ligated to the restriction enzyme digested ends of DNA. The unique feature of this method lies in the construction of the synthetic duplexes, termed vectorette units. Vectorette units contain a bubble region of non-complementarity. The vectorette PCR primer is identical to one of the non-complementary portions. The vectorette PCR primer, therefore, contains no region of complementarity to the end-modified DNA unless polymerase extension is initiated from an upstream portion of a DNA strand. The DNA strand of interest is amplified to the extent that this initial DNA primer extension (from the non-vectorette primer that anneals to the known region) is specific for the strand of interest. A limiting factor with this method may be the specificity of the primer extension step that generates an annealing site for the vectorette primer. This is because primer extension from a site near the 5' end of any DNA strand will create an annealing site for the vectorette primer, which results in a PCR product.
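The vectorette logic described above can be illustrated with a toy string model. The following is only a minimal sketch under stated assumptions: all sequences, and the helper names revcomp, can_anneal and primer_extension, are hypothetical and chosen for illustration; annealing is modeled as an exact reverse-complement match, which ignores mismatch tolerance and reaction conditions.

```python
# Toy model of vectorette PCR. All sequences are invented for illustration.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a 5'->3' DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def can_anneal(primer, template):
    """Assumption: a primer anneals only where the template contains its exact reverse complement."""
    return revcomp(primer) in template


def primer_extension(primer, template):
    """Extend the primer to the 5' end of the template; return the new strand (5'->3') or None."""
    pos = template.find(revcomp(primer))
    if pos < 0:
        return None
    # The polymerase copies everything between the annealing site and the template's 5' end.
    return primer + revcomp(template[:pos])


known = "ATGGCCTAAC"          # known region (hypothetical)
unknown = "TTCAGGTCAG"        # unknown flanking DNA (hypothetical)
arm_a, arm_b = "CTCTCC", "GAGAAG"   # base-paired arms of the vectorette unit
bubble_top = "AAAAAAAAAA"           # top-strand bubble
bubble_bottom = "CACACGGTTG"        # bottom-strand bubble, not complementary to bubble_top

# Both strands of the fragment after ligation of the vectorette unit, written 5'->3'.
top = known + unknown + arm_a + bubble_top + arm_b
bottom = revcomp(arm_b) + bubble_bottom + revcomp(arm_a) + revcomp(unknown) + revcomp(known)

vectorette_primer = bubble_bottom   # identical to one bubble strand
product = primer_extension(known, bottom)

print(can_anneal(vectorette_primer, top), can_anneal(vectorette_primer, bottom))
print(can_anneal(vectorette_primer, product))
```

In this model the vectorette primer finds no annealing site on either ligated strand, and gains one only on the strand produced by extension from the known-region primer. The same model also shows the limitation noted above: primer_extension creates that site for any primer that anneals anywhere on a vectorette-bearing strand, whether or not it is the strand of interest.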
The sixth method, called inverse PCR (Ochman et al., Genetics, 120, 621-623, 1988; Triglia et al., Nucleic Acids Res., 16, 8186, 1988; Silver and Keerikatte, J. Virol., 63, 1924-1928, 1989), permits amplification of DNA flanking a known sequence by circularization of restriction enzyme digested DNA. Circularization permits amplification of the flanking sequence by positioning two primers, each of which binds to the known sequence, "inside out" on the circle. Therefore, this strategy maintains specificity at each primer binding site. Difficulties with inverse PCR include the requirement for two restriction sites that flank the priming region and inefficient PCR amplification of closed circular DNA. This inefficient PCR amplification occurs if a convenient restriction enzyme site is not present to linearize the circle between the 5' ends of the two amplifying primers prior to PCR amplification (Silver and Keerikatte, J. Virol., 63, 1924-1928, 1989). Without linearization, double-stranded circular DNA is amplified much less efficiently than linear DNA (Jones and Howard, BioTechniques, 10, 62-66, 1991). Nicking the circles by heating ameliorates the difficulty in amplifying closed circular double-stranded DNA, but only a small percentage of the circles are nicked between the two 5' ends of the amplifying primers. Therefore, any increase in the initial amplification efficiency is suboptimal.
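The geometry of inverse PCR can likewise be sketched on a single strand. This is an idealized illustration, not a model of the reaction: the function name inverse_pcr_product and the sequences are hypothetical, the fragment is assumed to be an EcoRI (G^AATTC) restriction fragment, and the primers are assumed to sit exactly at the two edges of the known region, pointing outward.

```python
def inverse_pcr_product(fragment, known):
    """Return the flanking DNA spanned by outward-facing primers on the circularized fragment.

    fragment: top strand (5'->3') of one restriction fragment containing `known`.
    known:    the known sequence to which both primers anneal.
    """
    start = fragment.find(known)
    if start < 0:
        raise ValueError("known sequence not found in fragment")
    end = start + len(known)
    left_flank = fragment[:start]
    right_flank = fragment[end:]
    # Circularization joins the fragment's 3' end to its 5' end, so reading
    # outward from the known region crosses the ligation junction:
    # the right flank comes first, followed by the left flank.
    return right_flank + left_flank


# Hypothetical EcoRI fragment: the top strand starts with AATTC and ends with G.
fragment = "AATTCGGGG" + "ACGTACGT" + "TTTTG"
product = inverse_pcr_product(fragment, "ACGTACGT")
print(product)
```

In this example the two flanks appear in the product in swapped order, joined at a regenerated GAATTC site. Both flanks must end at a restriction site for the circle to form, which is the two-site requirement noted above.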
A major obstacle in using existing methods for the PCR amplification of specific sequences in genomic DNA is the occurrence of nonspecific amplification products. Under PCR conditions, the stringency of the priming (Sommer and Tautz, Nucleic Acids Res., 17, 6749, 1989) is seldom high enough to generate a pure product longer than 1 kilobase (kb) in highly complex mixtures, such as human genomic DNA. This limits both the specificity of the reaction and the length of the amplifiable DNA. Use of nested primers (Mullis et al., Cold Spring Harbor Symposia on Quantitative Biology, Cold Spring Harbor Laboratory, LI, 263-273, 1986; Haqqi et al., Nucleic Acids Res., 16, 11844, 1988) and size selection of the regions of interest by previous Southern blotting (Ochman et al., Genetics, 120, 621-623, 1988; Beck and Ho, Nucleic Acids Res., 16, 9051, 1988) diminish this problem. However, high background due to insufficient stringency during the PCR amplification of genomic DNA remains a significant problem. It is not surprising, therefore, that the methods to amplify unknown flanking DNA result in limited specificity, as the initial PCR amplification using these methods does not improve upon the specificity level conferred by conventional two-primer PCR. Certainly, an approach that optimizes the specificity of amplification of the unknown sequence is advantageous, regardless of the other strategies used to increase specificity (nested primers, size selection, physical separation of biotinylated products with streptavidin).
One way that the present invention overcomes the limitations encountered by the known PCR amplification methods is to use only one restriction enzyme site that flanks the priming region, instead of the two restriction enzyme sites in inverse PCR. This site is contained in the unknown flanking DNA. Also, the method of the present invention does not generate the less efficiently amplified double-stranded DNA circle produced from the self-annealing and ligation reaction of inverse PCR.
Since the method of the present invention generates a single-strand template, and the placement of known DNA on the opposite end of the strand of interest requires sequence specific annealing, it yields very high specificity. Furthermore, exploitation of the known properties of a suitable DNA polymerase, in conjunction with this method, permits purification of the template in solution, rendering this method specific for the DNA of interest.