Genetic markers represent (mark the location of) specific loci in the genome of a species or closely related species. A sampling of different genotypes at these marker loci reveals genetic variation. The genetic variation at marker loci can then be described and applied to marker assisted selection, genetic studies, commercial breeding, diagnostics, cladistic analysis of variance, genotyping of samples, forensic analysis and the like.
Genetic markers have the greatest utility when they are highly heritable, multi-allelic, and numerous. Most genetic markers are highly heritable because their alleles are determined by the nucleotide sequence of DNA, which is highly conserved from one generation to the next, and the detection of their alleles is unaffected by the natural environment. Markers have multiple alleles because, in the evolutionary process, rare, genetically-stable mutations in DNA sequences defining marker loci arose and were disseminated through the generations along with other existing alleles. The highly conserved nature of DNA combined with the rare occurrence of stable mutations allows genetic markers to be both predictable and discerning of different genotypes.
DNA fingerprinting is a broad term used to designate methods for assessing sequence differences in DNA isolated from various sources, e.g., by comparing the presence of marker DNA in samples of isolated DNA. Typically, DNA fingerprinting is used to analyze and compare DNA from different species of organisms or DNA from different individuals of the same species. DNA sequence differences detected by fingerprinting are referred to as DNA polymorphisms. The presence of a DNA polymorphism in an organism's DNA can serve to indicate that the genetic origin of such an organism is different from the genetic origin of organisms whose DNA does not have the polymorphism. Such polymorphisms can result, e.g., from insertion, deletion, and/or mutation events in the genome.
Many genetic-marker technologies are adaptable to fingerprinting, including restriction-fragment-length polymorphism (RFLP) Botstein et al. (1980) Am J Hum Genet 32:314-331; single strand conformation polymorphism (SSCP) Fischer et al. (1983) Proc Natl Acad Sci USA 80:1579-1583, Orita et al. (1989) Genomics 5:874-879; amplified fragment-length polymorphism (AFLP) Vos et al. (1995) Nucleic Acids Res 23:4407-4414; microsatellite or simple-sequence repeat (SSR) Weber JL and May PE (1989) Am J Hum Genet 44:388-396; randomly amplified polymorphic DNA (RAPD) Williams et al. (1990) Nucleic Acids Res 18:6531-6535; sequence tagged site (STS) Olson et al. (1989) Science 245:1434-1435; genetic-bit analysis (GBA) Nikiforov et al. (1994) Nucleic Acids Res 22:4167-4175; allele-specific polymerase chain reaction (ASPCR) Gibbs et al. (1989) Nucleic Acids Res 17:2437-2448, Newton et al. (1989) Nucleic Acids Res 17:2503-2516; nick-translation PCR (e.g., TaqMan™) Lee et al. (1993) Nucleic Acids Res 21:3761-3766; and allele-specific hybridization (ASH) Wallace et al. (1979) Nucleic Acids Res 6:3543-3557, Sheldon et al. (1993) Clinical Chemistry 39(4):718-719, among others. Kits for RAPD and AFLP analyses are commercially available, e.g., from Perkin Elmer Applied Biosystems (Foster City, Calif.). For example, the restriction fragment length polymorphism (RFLP) technique employs restriction enzyme digestion of DNA, followed by size separation of the digested DNA by gel electrophoresis, and hybridization of the size-separated DNA with a specific polynucleotide fragment. Differences in the size of the restriction fragments to which the polynucleotide probe binds reflect sequence differences in DNA samples, or DNA polymorphisms. See Tanksley, Biotechnology 7:257-264 (1988).
PCR-based fingerprinting methods result in the generation of a large number of reproducible DNA fragments of specific size that can be separated, typically by gel electrophoresis. These fragments are visualized to produce a "fingerprint" of the amplified DNA. Visualization of the size-separated fragments is effected by direct visualization, e.g., with a fluorescent dye, by hybridization with a polynucleotide probe, or by labeling the amplification products during PCR (radioactively or fluorescently) followed by detection of the labeled products in the gel. These fingerprints have a variety of uses: parentage analysis, linkage analysis of specific traits, analysis of the degree of genetic relationship between individuals within a species, and analysis of phylogenetic relationships between species. Fingerprinting has considerable commercial use in agriculture for marker assisted selection of genetic traits specific to particular genotypes (e.g., in crops or animals), identification and mapping of quantitative trait loci (QTLs) and the like.
A problem common to all DNA fingerprinting techniques in the prior art stems from the low throughput of the techniques. There exists a need to simplify and speed up DNA fingerprint analysis. The RFLP technique attempts to solve this problem by producing a limited number of DNA fragments by selective use of restriction enzymes, size separating DNA fragments using gel electrophoresis, and employing specific polynucleotide probes to visualize a small number of DNA fragments at any one time. The RAPD and SSR techniques selectively amplify only one or a few fragments at a time, and this small array of fragments is separated by gel electrophoresis and visualized. The AFLP technique also selectively amplifies certain restriction fragments, followed by size separation using acrylamide sequencing gels. DNA fragments are visualized by autoradiography or by detection of fluorescence of labeled DNA molecules which were produced using labeled primers during the amplification procedure.
Each prior art fingerprinting technique is of limited usefulness because each fingerprint is generated by size separation using gel electrophoresis of each DNA sample analyzed. No meaningful data is generated without electrophoresis of the DNA samples to be analyzed. Both polyacrylamide and agarose gel electrophoresis are time consuming. Each DNA fingerprint using prior art methods requires running a gel, visualizing the DNA fragments on the gel, and analyzing the DNA fragment pattern. Thus, the number of DNA polymorphisms that can be analyzed at one time is limited by the time and cost of preparing and analyzing a gel electrophoresis fingerprint. Data density is limited by the resolution of the gels and the capability of image analysis systems to reproducibly record the sizes of the separated fragments. In addition, the utility of existing methods is limited because the identity of each band amplified or hybridized is normally established by size rather than sequence, making it difficult or impossible to precisely correlate bands on gels with alleles.
Therefore, it would be very useful to have a method for DNA fingerprinting that does not rely on gel electrophoresis for the generation of fingerprint information. Such a method would not require analysis of the complex data in a gel fingerprint and would allow the production of more DNA polymorphism data in less time and at a lower cost compared to levels currently achievable using prior art methods. In addition, a method which uses polynucleotide probes of known sequence has the advantage of being able to specifically associate DNA markers with alleles. This invention fulfills these and other needs.
The invention provides compositions, probes, methods of fingerprinting and genotyping, new marker assisted selection methods, methods of making probes, integrated systems for performing high-throughput assays, and other features which will be apparent upon reading this disclosure.
The fingerprinting methods herein do not rely on the rate-limiting step of gel electrophoresis for the generation of DNA fingerprints and can, therefore, produce a large number of DNA fingerprints in a short time. In one preferred embodiment, AFLP is used to identify differentially amplified nucleic acids, which are then converted into polynucleotide probes which map to polymorphisms. The differentially amplified AFLP DNAs are converted into polynucleotide probes by isolating individual polymorphic AFLP fragments from a mixture of fragments in an AFLP amplification product, followed by using these isolated fragments (or clones or subclones thereof) as polynucleotide probes in hybridizations with immobilized DNA amplification mixtures (e.g., AFLP products). To generate a DNA fingerprint, a polynucleotide probe made according to the method of the invention is hybridized to a mixture of AFLP amplified DNA restriction fragments from DNA samples, generating a "positive" or "negative" hybridization result. Many unique DNA samples (typically in the thousands) can be analyzed together in a single hybridization. A series of hybridizations yields a unique fingerprint of each DNA sample in the analysis set of samples. This method is an improvement over the gel-based AFLP technique, which relies on gel electrophoresis for the production of every DNA fingerprint, significantly lowering the number of samples that can be analyzed easily. Gel-based AFLP techniques also suffer from the lack of a precise method for distinguishing AFLP fragments that have different sequences but have the same length. The hybridization-based assays of the invention can easily distinguish fragments with different sequences. Hybridization improves the genotyping capability of the AFLP technique in both sample throughput and specificity.
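By way of illustration only, the manner in which a series of "positive" or "negative" hybridization results can be assembled into a fingerprint for each sample may be sketched as follows. The probe names, sample names, data values, and function names are hypothetical and do not appear in this disclosure; treating a missing result as negative is likewise an assumption of the sketch.

```python
from typing import Dict, List

# Hypothetical hybridization results: CALLS[probe][sample] is True for a
# "positive" result and False for a "negative" one. All names are illustrative.
CALLS: Dict[str, Dict[str, bool]] = {
    "probe_A": {"line_1": True, "line_2": True, "line_3": False},
    "probe_B": {"line_1": True, "line_2": False, "line_3": False},
    "probe_C": {"line_1": False, "line_2": True, "line_3": False},
}


def build_fingerprints(
    calls: Dict[str, Dict[str, bool]], probes: List[str]
) -> Dict[str, str]:
    """Return one string of '1'/'0' per sample, one character per probe.

    The order of characters follows the order of `probes`. A sample with no
    recorded result for a probe is scored '0' (an assumption of this sketch).
    """
    samples = sorted({sample for probe in probes for sample in calls[probe]})
    return {
        sample: "".join(
            "1" if calls[probe].get(sample, False) else "0" for probe in probes
        )
        for sample in samples
    }


def fingerprints_are_unique(fingerprints: Dict[str, str]) -> bool:
    """True when no two samples share the same fingerprint."""
    return len(set(fingerprints.values())) == len(fingerprints)


FINGERPRINTS = build_fingerprints(CALLS, ["probe_A", "probe_B", "probe_C"])
```

In this sketch a single probe leaves two of the three hypothetical samples indistinguishable, whereas the series of three probes separates all of them, which is the sense in which a series of hybridizations yields a fingerprint.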
The techniques of the invention are adaptable to characterization of any biological nucleic acid (RNA, cDNA, genomic DNA, synthetic DNA or the like). In one aspect, a probe which hybridizes to a marker in linkage disequilibrium with a polymorphism is provided. The probe can be provided, e.g., by isolating, cloning, sub-cloning or synthesizing a nucleic acid corresponding to (the same as or hybridizing to) a marker such as a differentially amplified AFLP fragment. An exemplary probe is an oligonucleotide between about 8 and about 100 nucleotides in length corresponding to a polymorphic nucleotide marker nucleic acid. The probe is hybridized to a mixture of amplified biological DNA which includes a target nucleic acid which has the polymorphism as a subsequence. The amplified DNA can be amplified, e.g., by cloning, PCR, LCR, TAS, 3SR, NASBA, Qβ amplification or the like. The DNA is optionally heterogeneous by either size or sequence, or both. Typically, the amplified DNA is genomic DNA (including cellular genomic DNA, and DNA from an organelle such as a mitochondrion, chloroplast or the like), or cDNA. In a preferred assay format, the amplified DNA mixture or the probe is fixed to a solid support.
The invention further provides methods of mapping polymorphic genetic markers. In the methods, a mixture of restriction enzyme-digested nucleic acids from biological samples is provided. The mixture is amplified, thereby identifying a set of differentially amplified nucleic acids in the mixture, and at least one of the differentially amplified nucleic acids is mapped to a unique genetic polymorphism, thereby providing a marker for the polymorphism. Typically, more than one differentially amplified nucleic acid is mapped, thereby providing a set of markers. The set can be of any size, although more information is provided by larger sets. Typical set sizes are from about 1-100 markers, often 10-50 markers, generally about 10-30 markers. In one typical format, the method includes hybridizing a probe nucleic acid to a mixture of DNA amplified from a biological source of DNA comprising the polymorphism, thereby identifying the polymorphism in the biological source of DNA. In this format, the probe nucleic acid hybridizes under stringent conditions to a target nucleic acid comprising the polymorphism. This information is typically used to genotype a biological sample, e.g., for marker assisted selection.
In several embodiments, the invention comprises detection of target nucleic acids in an amplified mixture of DNA, by hybridizing a probe to the amplified mixture. Depending on the available equipment and intended application, many hybridization formats are desirable. For example, either the amplified mixture or the probe can be fixed to the solid support. Typically, the solid phase of the assay will be in an array format, with either selected probes or selected amplified mixtures being fixed to predetermined locations of the array, facilitating consideration of hybridization signal information. The assays may be performed in serial or in parallel formats, i.e., by simultaneously or serially measuring hybridization results of probe-amplification mixture hybridization. Many other variations will be apparent upon full review of this disclosure.
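As a non-limiting sketch of the array format described above, the following shows one way that items fixed to predetermined locations could be related back to measured hybridization signals. The row-by-row layout, the numeric signal values, the threshold, and all names are assumptions made for illustration and are not taken from this disclosure.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]  # (row, column) on the solid support


def assign_positions(items: List[str], n_cols: int) -> Dict[Position, str]:
    """Assign each probe or amplified mixture to a predetermined location.

    Items are laid out row by row; the grid width `n_cols` is hypothetical.
    """
    if n_cols < 1:
        raise ValueError("n_cols must be at least 1")
    return {divmod(index, n_cols): item for index, item in enumerate(items)}


def call_spots(
    signals: Dict[Position, float],
    layout: Dict[Position, str],
    threshold: float,
) -> Dict[str, bool]:
    """Score each arrayed item positive when its signal meets the threshold.

    A location with no recorded signal is scored negative (an assumption).
    """
    return {
        item: signals.get(position, 0.0) >= threshold
        for position, item in layout.items()
    }


LAYOUT = assign_positions(["sample_1", "sample_2", "sample_3"], n_cols=2)
SIGNALS = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5}
RESULTS = call_spots(SIGNALS, LAYOUT, threshold=0.5)
```

Because each location is fixed in advance, a signal read at a given position can be attributed to a particular probe or amplified mixture without any size separation step.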
The invention also provides probes, compositions and methods of making probes. For example, the invention provides compositions having a marker nucleic acid which specifically hybridizes to a nucleotide polymorphism and an amplified mixture of DNA isolated from a biological source.
Probes used in the above assays can be made by providing first and second samples of amplified DNA; comparing the first and second samples of amplified DNA to identify differentially amplified DNAs; isolating the differentially amplified DNAs, thereby providing isolated differentially amplified DNAs; and genetically mapping the isolated differentially amplified DNAs, thereby providing a genetically mapped isolated DNA which hybridizes to a unique polymorphic nucleic acid. Typically, at least a portion of the genetically mapped isolated DNA is sequenced to identify associated polymorphisms. Oligonucleotides comprising a portion of the sequenced region are also provided. Preferred probes uniquely map to single sites in a haploid genomic DNA of a plant or animal, or to cDNA.
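The comparison step of the probe-making method can be illustrated, under simplifying assumptions, as a comparison of two sets of fragments. In the sketch below each amplified sample is represented by a set of hypothetical fragment sizes in base pairs; identifying a fragment by its size alone is a simplification adopted only for the example, since fragments of the same length can differ in sequence. The function name and data are illustrative.

```python
from typing import Dict, Set


def differentially_amplified(
    first: Set[int], second: Set[int]
) -> Dict[str, Set[int]]:
    """Return fragments present in one amplified sample but not the other.

    Each fragment is represented by its size in base pairs, which is an
    assumption of this sketch rather than a requirement of the method.
    """
    return {
        "only_in_first": first - second,
        "only_in_second": second - first,
    }


# Hypothetical fragment sizes (bp) observed for two amplified samples.
SAMPLE_ONE = {112, 187, 240, 305}
SAMPLE_TWO = {112, 240, 298, 305}

CANDIDATES = differentially_amplified(SAMPLE_ONE, SAMPLE_TWO)
```

Fragments reported by such a comparison would be the candidates for isolation and genetic mapping in the subsequent steps of the method.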
Any of the assays or compositions provided herein are optionally provided or practiced in kit form. Kits optionally have one or more components selected from the group consisting of a container, instructional materials, one or more control nucleic acids complementary to the markers, and recombinant cells comprising one or more target nucleic acids.