The invention relates to novel polymerase chain reaction (PCR) amplification techniques and their use, for example, for identifying single nucleotide polymorphisms.
Dense linkage maps are invaluable tools for genetic and genomic analysis. They facilitate high resolution genetic mapping, positional cloning of monogenic traits, genetic dissection of polygenic traits, fine-structure linkage disequilibrium studies, and the construction of genome-wide physical maps. Historically, genetic maps were constructed with visible markers, but it is difficult to examine many such markers in a single cross. The recognition that distantly related individuals differ in DNA sequence throughout their genome (Botstein et al., Am. J. Hum. Genet. 32: 314-331, 1980) led to the rapid incorporation of DNA markers into mapping strategies. Useful DNA markers have the following general characteristics: (1) they are inherited in a Mendelian fashion; (2) they are present in most individuals analyzed and recognize a sequence that is polymorphic; (3) they correspond to a single site in the genome; (4) the probe used to recognize the marker hybridizes selectively and efficiently, even under conditions of low stringency; and (5) they can be distributed throughout a community, either as clones or as DNA sequences.
Until recently, the most commonly used DNA markers were restriction fragment length polymorphisms (RFLPs), anonymous single copy-number genomic clones that reveal a polymorphism in the length of a restriction fragment, typically by DNA blot hybridization. RFLP mapping is well-suited for determining the genetic location of any newly-cloned DNA sequence; the DNA fragment can be used as a hybridization probe (assuming it detects an RFLP) against the DNA filters used to construct the RFLP map. However, in many cases, new genes are identified by mutations, and mapping such a mutation onto an RFLP map can be a lengthy and arduous procedure.
In general, the invention features a method for determining whether a nucleic acid sequence includes a particular allele of a polymorphic sequence, involving:
(a) contacting a nucleic acid sequence, in the same or a separate reaction, with a first pair of PCR primers and a second pair of PCR primers under conditions that allow hybridization of the PCR primers to the nucleic acid sequence, the first pair of PCR primers hybridizing to opposite strands of the nucleic acid sequence and bordering the position of the polymorphic sequence, and the second pair of PCR primers hybridizing to opposite strands of the nucleic acid sequence and bordering the position of the polymorphic sequence, the PCR primers being characterized as follows:
(i) one of the first pair of PCR primers (a) being complementary at its 3′-terminal nucleotide to a first allele of the polymorphic sequence (allele A), (b) being non-complementary at its 3′-terminal nucleotide to a second allele of the polymorphic sequence (allele B), and (c) being non-complementary to the nucleic acid sequence at a single non-complementary nucleotide in its 3′-terminal nucleotides 2-6; and
(ii) one of the second pair of PCR primers (a) being complementary at its 3′-terminal nucleotide to the first allele of the polymorphic sequence (allele A), (b) being non-complementary at its 3′-terminal nucleotide to the second allele of the polymorphic sequence (allele B), and (c) being non-complementary to the nucleic acid sequence at one (and, preferably, two) or more nucleotides in its 3′-terminal nucleotides 2-6;
(b) carrying out the amplification reactions; and
(c) detecting an amplification product as an indication of the presence, in the nucleic acid sequence, of the first allele of the polymorphic sequence (allele A).
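The primer characteristics recited in step (a) can be illustrated with a short sketch. It is only an illustration under stated assumptions, not a primer design procedure taken from this description: the function name, the 20-nucleotide default length, the example sequence, and the rule for choosing the mismatched base (substituting the Watson-Crick partner) are all hypothetical. The primer is written 5′ to 3′ as the strand identical to the template strand shown, so that it anneals to the opposite strand; position 1 is the 3′-terminal nucleotide.

```python
# Hypothetical substitution rule: replace a base with its Watson-Crick partner,
# which guarantees that the primer differs from the template at that position.
SWAP = {"A": "T", "T": "A", "G": "C", "C": "G"}


def allele_specific_primer(upstream, allele_base, mismatch_positions, length=20):
    """Build an allele-specific primer (5'->3').

    upstream: template sequence immediately 5' of the polymorphic site.
    allele_base: nucleotide of the targeted allele; it becomes the
        3'-terminal nucleotide (position 1) of the primer.
    mismatch_positions: positions, counted from the 3' end, at which a
        deliberate mismatch is introduced; each must lie in 2-6.
    """
    if length < 7:
        raise ValueError("primer must span at least positions 1-6 plus one base")
    if len(upstream) < length - 1:
        raise ValueError("upstream sequence is shorter than the primer")
    if any(not 2 <= p <= 6 for p in mismatch_positions):
        raise ValueError("mismatches are restricted to 3'-terminal positions 2-6")

    primer = list(upstream[-(length - 1):] + allele_base)
    for p in mismatch_positions:
        idx = len(primer) - p  # position 1 maps to the last index
        primer[idx] = SWAP[primer[idx]]
    return "".join(primer)


# Made-up sequence; allele A is assumed to be "T" at the polymorphic site.
upstream = "ACGTACGTACGTACGTACGTACG"
first_pair_primer = allele_specific_primer(upstream, "T", [3])      # single mismatch
second_pair_primer = allele_specific_primer(upstream, "T", [2, 4])  # two mismatches
```

Both primers end in the allele A nucleotide and therefore mismatch allele B at the 3′ terminus; they differ only in the number of additional internal mismatches. Which of positions 2-6 to alter is a choice made here purely for the example.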
If desired, the method may involve the further steps of:
(a) contacting the nucleic acid sequence, in the same or a separate reaction, with a third pair of PCR primers and a fourth pair of PCR primers under conditions that allow hybridization of the PCR primers to the nucleic acid sequence, the third pair of PCR primers hybridizing to opposite strands of the nucleic acid sequence and bordering the position of the polymorphic sequence, and the fourth pair of PCR primers hybridizing to opposite strands of the nucleic acid sequence and bordering the position of the polymorphic sequence, the PCR primers being characterized as follows:
(i) one of the third pair of PCR primers (a) being complementary at its 3′-terminal nucleotide to the second allele of the polymorphic sequence (allele B), (b) being non-complementary at its 3′-terminal nucleotide to the first allele of the polymorphic sequence (allele A), and (c) being non-complementary to the nucleic acid sequence at a single nucleotide in its 3′-terminal nucleotides 2-6; and
(ii) one of the fourth pair of PCR primers (a) being complementary at its 3′-terminal nucleotide to the second allele of the polymorphic sequence (allele B), (b) being non-complementary at its 3′-terminal nucleotide to the first allele of the polymorphic sequence (allele A), and (c) being non-complementary to the nucleic acid sequence at one (and, preferably, two) or more nucleotides in its 3′-terminal nucleotides 2-6;
(b) carrying out the amplification reactions; and
(c) detecting an amplification product as an indication of the presence, in the nucleic acid sequence, of the second allele of the polymorphic sequence (allele B).
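When both sets of steps are carried out, the two detection results can be combined into a genotype call. The sketch below is a minimal illustration, assuming that each detection step has already been reduced to a yes/no result per allele; the function name and the returned labels are hypothetical and are not taken from this description.

```python
def call_genotype(allele_a_detected, allele_b_detected):
    """Combine the two detection results into a genotype label.

    allele_a_detected: True if an amplification product indicating
        allele A was detected.
    allele_b_detected: True if an amplification product indicating
        allele B was detected.
    """
    if allele_a_detected and allele_b_detected:
        return "A/B"      # both alleles present: heterozygote
    if allele_a_detected:
        return "A/A"      # only allele A present
    if allele_b_detected:
        return "B/B"      # only allele B present
    return "no call"      # no product: failed reaction or neither allele


calls = [call_genotype(a, b) for a, b in [(True, False), (True, True), (False, True)]]
```

Because each allele is assayed with its own primers, a heterozygote gives a result distinct from either homozygote.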
In a related aspect, the invention features kits for carrying out the method of the invention. One particular kit for determining whether a nucleic acid sequence includes a particular allele of a polymorphic sequence includes (a) a first pair of PCR primers and a second pair of PCR primers, the first pair of PCR primers hybridizing to opposite strands of the nucleic acid sequence and bordering the position of the polymorphic sequence, and the second pair of PCR primers hybridizing to opposite strands of the nucleic acid sequence and bordering the position of the polymorphic sequence, the PCR primers being characterized as follows: (i) one of the first pair of PCR primers (a) being complementary at its 3′-terminal nucleotide to a first allele of the polymorphic sequence (allele A), (b) being non-complementary at its 3′-terminal nucleotide to a second allele of the polymorphic sequence (allele B), and (c) being non-complementary to the nucleic acid sequence at a single non-complementary nucleotide in its 3′-terminal nucleotides 2-6; and (ii) one of the second pair of PCR primers (a) being complementary at its 3′-terminal nucleotide to the first allele of the polymorphic sequence (allele A), (b) being non-complementary at its 3′-terminal nucleotide to the second allele of the polymorphic sequence (allele B), and (c) being non-complementary to the nucleic acid sequence at one (and, preferably, two) or more nucleotides in its 3′-terminal nucleotides.
If desired, the kit may also include (a) a third pair of PCR primers and a fourth pair of PCR primers, the third pair of PCR primers hybridizing to opposite strands of said nucleic acid sequence and bordering the position of the polymorphic sequence, and the fourth pair of PCR primers hybridizing to opposite strands of the nucleic acid sequence and bordering the position of the polymorphic sequence, the PCR primers being characterized as follows: (i) one of the third pair of PCR primers (a) being complementary at its 3′-terminal nucleotide to the second allele of said polymorphic sequence (allele B), (b) being non-complementary at its 3′-terminal nucleotide to the first allele of the polymorphic sequence (allele A), and (c) being non-complementary to the nucleic acid sequence at a single nucleotide in its 3′-terminal nucleotides 2-6; and (ii) one of the fourth pair of PCR primers (a) being complementary at its 3′-terminal nucleotide to the second allele of the polymorphic sequence (allele B), (b) being non-complementary at its 3′-terminal nucleotide to the first allele of the polymorphic sequence (allele A), and (c) being non-complementary to the nucleic acid sequence at one (and, preferably, two) or more nucleotides in its 3′-terminal nucleotides 2-6.
In preferred embodiments of any of the above methods or kits, the amplification reaction involving the first pair of PCR primers and the amplification reaction involving the second pair of PCR primers have different ranges of specificity; have ranges of specificity that overlap; and together have a greater than 3000-fold, and preferably at least a 10,000-fold, range of specificity.
In addition, the methods and kits are used to identify a single nucleotide polymorphism; each of the primers of the first and the second primer pairs that includes a non-complementary nucleotide in 3′-terminal nucleotides 2-6 may also include a unique hybridization tag and/or a universal primer binding site; the detection step is facilitated by the hybridization tag and/or the universal priming site; and the detection step is carried out on a solid support (for example, a chip) to which a binding partner for each hybridization tag is immobilized.
As used herein, by “polymorphic sequence” is meant any nucleotide sequence capable of variation, and by “allele” is meant one such variation. Preferably, such a variation is common in a population of organisms and is inherited in a Mendelian fashion. Such alleles may or may not have associated phenotypes. A “single nucleotide polymorphism” (or “SNP”) is one type of “polymorphic sequence” which is characterized by a sequence variation of only one nucleotide.
By “range of specificity” is meant the range of nucleic acid template:PCR primer ratios at which template sequences differing by at least one nucleotide may be discriminated by assaying for the presence of detectable PCR amplification product formation.
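The way two overlapping ranges of specificity combine into a wider one can be shown with simple arithmetic. The sketch below is an illustration only: each range is represented as a hypothetical (lowest, highest) pair of template:primer ratios in arbitrary relative units, and the numbers are made up for the example rather than measured values.

```python
def combined_fold_range(first, second):
    """Return the fold range covered by two overlapping ranges of specificity.

    Each argument is a (lowest, highest) pair of template:primer ratios in
    the same arbitrary units. Returns None if the ranges do not overlap,
    since a gap would leave some ratios at which neither reaction
    discriminates between the alleles.
    """
    (lo1, hi1), (lo2, hi2) = first, second
    if max(lo1, lo2) > min(hi1, hi2):
        return None
    return max(hi1, hi2) / min(lo1, lo2)


# Hypothetical ranges: a 100-fold range and a 200-fold range that overlap.
first_reaction = (1, 100)
second_reaction = (50, 10_000)
fold = combined_fold_range(first_reaction, second_reaction)
```

With these illustrative numbers, neither reaction alone spans more than 200-fold, yet together they span 10,000-fold, which is the reason for pairing primers whose ranges differ but overlap.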
By “hybridization tag” is meant an oligonucleotide that differs sufficiently in sequence from a target nucleic acid (for example, a target nucleic acid to be amplified) that significant cross-hybridization does not occur. When multiple hybridization tags are utilized in a single reaction mixture, these tags also preferably differ in sequence from one another such that each has a unique binding partner.
As described more fully below, the technique described herein provides a significant advance over other PCR-based techniques, particularly for carrying out genomic mapping analyses. For example, one widely used, more conventional PCR-based approach involves the use of single, short PCR primers of arbitrary sequence (called “RAPD” primers for “random amplified polymorphic DNA”; Williams et al., Nucleic Acids Research 18: 6531-6535, 1990). In a given individual, amplification with a RAPD primer typically results in the synthesis of one or more DNA fragments, while in another individual, the primer fails to amplify the same set of fragments. Because RAPD markers are dominant, they do not allow heterozygotes to be reliably scored (see Botstein et al., 1980, supra). In addition, because RAPD primers typically have low melting temperatures, the amplification of a specific sequence or sequences using such a primer is highly sensitive to PCR conditions, including template concentration and annealing temperature. It is thus often difficult to correlate results obtained by different research groups (Devos and Gale, Theor. Appl. Genet. 84: 567-572, 1992). Finally, because RAPD primers frequently amplify more than one sequence, resulting in multiple bands, analysis of the results can be complicated (Riedy et al., PCR. Nucleic Acids Research 20: 918, 1992).
Similarly, another technique in current usage exploits “AFLPs,” or “amplified fragment length polymorphisms.” In this method, DNAs from two polymorphic individuals are cleaved with one or two restriction endonucleases and adapters are ligated to the ends of the cleaved fragments (Vos et al., Nucleic Acids Research 23: 4407-4414, 1995). The fragments are then amplified using primers that are homologous to the adapter(s) which contain a short stretch of random nucleotides at the 3′ end. These random nucleotides limit the number of amplified fragments and reveal polymorphisms between the two individuals which are detected by displaying the amplified products on an acrylamide sequencing gel. Although large numbers of AFLPs can be detected in a single lane in a sequencing gel, this technique is limited by its requirement for acrylamide gel detection, as well as by the fact that many fragments are generally amplified in each lane, resulting in a complicated pattern that requires expensive, automated high-resolution imaging technology to reliably decipher.
Finally, in yet another PCR technique, markers referred to as “simple sequence length polymorphisms” or “SSLPs” are utilized. These markers are based on amplification across tandem repeats of one or a few nucleotides known as “microsatellites.” Microsatellites occur randomly in most eukaryotic genomes and display a high degree of polymorphism due to variations in the number of repeat units. Simple sequence repeats are very abundant in most mammalian genomes, and the most common simple sequence repeat is (CA)n (Dietrich et al., Proc. Natl. Acad. Sci. USA 92: 10849-10853, 1995). The repeat length varies among individuals in a species, apparently due to slippage during DNA replication (Dietrich et al., Genetics 131: 423-447, 1992). One major advantage of SSLPs is that they are co-dominant markers. That is, different patterns are obtained for organisms that are homozygous and heterozygous for the parental alleles. Another advantage of SSLPs is that, because they are highly polymorphic at a given locus, randomly selected SSLPs are likely to be informative in any given mapping population, and are therefore especially useful for studying evolutionary relationships. However, like AFLPs, certain SSLP markers can only be assayed by acrylamide gel electrophoresis and currently available SSLP assay methods are not suited to high throughput analysis using micro DNA arrays (for example, displayed on DNA chips) (Fodor et al., Science 251: 767-773, 1991; Chee et al., Science 274: 610-614, 1996; and Southern, Trends in Genetics 12: 110-115, 1996).
In contrast to the above techniques, the presently claimed approach provides a method for mapping polymorphic alleles that combines a number of advantageous features into a single format. First, the present technique makes use of allele-specific markers that are co-dominant; this facilitates the identification of polymorphic markers in homozygotes as well as heterozygotes. In addition, the present PCR technique may be readily automated, making it a practical method for large scale mapping efforts. This automation feature stems from the fact that the technique makes use of two allele-specific primers for each particular allele having different and complementary ranges of specificity, a feature that results in an increase in the range of template DNA concentrations that may be reliably assayed. This aspect of the invention is particularly important because sample DNA concentrations need not be determined, allowing the present technique to be used in conjunction with increasingly popular solid state formats, such as DNA chip formats.
Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.