The analysis of variation among polymorphic DNAs provides a valuable tool in medicine, forensic science, genetic engineering applications, gene mapping, and drug development. For example, variations in polymorphic DNAs allow one to distinguish one individual of a population from another, or to assess the predisposition of an individual to a heritable disease or trait.
If the variation is on the basis of the length of fragments generated by enzymatic cleavage (i.e., restriction endonuclease cleavage), the variations are commonly referred to as restriction fragment length polymorphisms (RFLPs). RFLPs are commonly used in human and animal genetic analyses (see, e.g., Skolnick et al. (1982) Cytogen. Cell Genet. 32:58-67; and Botstein et al. (1980) Am. J. Hum. Genet. 32:314-331), particularly in forensic applications. If a heritable trait can be linked to a particular RFLP, the presence of that RFLP in a subject can be used to predict the likelihood that the subject will exhibit the trait. Statistical methods have also been developed for multilocus analyses of RFLPs, for example wherein a genetic trait is linked to multiple allelic locations. See, e.g., Lander et al. (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357; Donis-Keller et al. (1987) Cell 51:319-337; and Lander et al. (1989) Genetics 121:185-199. RFLP analysis can also be used in genetic mapping techniques, as well as in genetic engineering.
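The way a sequence difference at a cleavage site produces a length polymorphism can be illustrated with a minimal sketch. The two allele sequences below are invented for illustration only, and the function name `digest` is hypothetical; the sketch assumes the EcoRI recognition site GAATTC, which is cut after the first base (G^AATTC), and ignores the second strand.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return the lengths of the fragments produced by cutting
    `sequence` at every occurrence of `site`.

    `cut_offset` is the number of bases of the site that stay on the
    left-hand fragment (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    # The remainder after the last cut is the final fragment.
    fragments.append(len(sequence) - start)
    return fragments


# Hypothetical alleles: allele_b has lost the second site (A -> C).
allele_a = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 15
allele_b = "A" * 10 + "GAATTC" + "C" * 20 + "GACTTC" + "T" * 15

print(digest(allele_a))  # three fragments
print(digest(allele_b))  # two fragments, one of them longer
```

In this toy case a single base change removes one cleavage site, so the two alleles yield different fragment-length patterns from the same enzyme, which is the kind of difference an RFLP assay detects.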
Other variations on the basis of length are generally characterized by short tandem repeats (STRs) or microsatellite repeats, that is, genomic regions that contain a variable number of repeated sequences (e.g., di-, tri-, tetra- or penta-nucleotide tandem repeats having lengths that range from roughly 80 to 400 bases and 3 to 15 alleles). Such repeats are common in the euchromatic arms of most mammalian chromosomes. When bracketed by conserved sequence in which PCR primers can be found, DNA length polymorphisms can be used as length polymorphic markers in genetic mapping and forensic applications.
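As a rough illustration of why a repeat-number difference appears as a length difference between conserved flanks, consider the following sketch. The flanking sequences, the GATA motif, and the repeat counts are all assumptions made up for the example, and `longest_repeat_run` is a hypothetical helper, not a published method.

```python
import re


def longest_repeat_run(sequence, motif):
    """Return the number of motif copies in the longest uninterrupted
    tandem run of `motif` found in `sequence` (0 if none)."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [match.group(0) for match in pattern.finditer(sequence)]
    if not runs:
        return 0
    return max(len(run) for run in runs) // len(motif)


# Hypothetical conserved flanks where PCR primers would anneal.
left_flank = "ACGTTGCA"
right_flank = "TGCAACGT"

# Two hypothetical alleles differing only in repeat number.
allele_7 = left_flank + "GATA" * 7 + right_flank
allele_9 = left_flank + "GATA" * 9 + right_flank

for allele in (allele_7, allele_9):
    print(longest_repeat_run(allele, "GATA"), len(allele))
```

Because the flanks are identical, the two amplified products differ in length by exactly the motif length times the difference in repeat number, here 8 bases.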
More particularly, length polymorphic markers are currently seeing widespread use in identifying genes via positional cloning and in genetic mapping in general. In such applications, a population known to exhibit a trait (e.g., a disease of interest) is genotyped to establish the pattern of inheritance of markers. Next, the correlation between marker inheritance and disease inheritance is determined (linkage analysis), from which one can determine which markers are physically close to the disease genes. The positions discovered are then used as starting points for cloning and sequencing, until the genes themselves are found. To perform genotyping, the exact length and/or sequence of many markers for many individuals needs to be determined. PCR amplification for each marker is performed, then the length of each PCR product is measured, for example using slab gel electrophoresis, capillary electrophoresis, or liquid chromatography.
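The linkage analysis step is commonly quantified with a LOD score. The following is a minimal sketch under simplifying assumptions that are not taken from the text above: phase-known meioses, a single marker, and a count of recombinant versus non-recombinant offspring. The function names and the example counts are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for phase-known meioses.

    Compares the likelihood of the data at recombination fraction
    `theta` with the likelihood under no linkage (theta = 0.5).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinants = meioses - recombinants
    likelihood_linked = theta**recombinants * (1.0 - theta) ** non_recombinants
    likelihood_unlinked = 0.5**meioses
    if likelihood_linked == 0.0:
        return float("-inf")
    return math.log10(likelihood_linked / likelihood_unlinked)


def best_theta(recombinants, meioses, steps=50):
    """Grid search for the theta in [0, 0.5] that maximizes the LOD."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: lod_score(recombinants, meioses, t))


# Hypothetical data: 1 recombinant observed among 10 informative meioses.
theta_hat = best_theta(1, 10)
print(theta_hat, round(lod_score(1, 10, theta_hat), 2))
```

A marker with a high maximum LOD score at a small recombination fraction is a candidate for being physically close to the trait locus; real analyses use pedigree likelihoods and many markers rather than this simplified count.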
Other aspects of, and different approaches to, analysis of microsatellite length polymorphic markers are treated in Hall et al. (1996) Genome Res. 6:781-790; and Perlin et al. (1995) Am. J. Human Genetics 7:1191-1210. DNA profiling assays for detecting length polymorphisms using PCR amplification and differential labeling of each sequence fragment are also known. See, e.g., U.S. Pat. No. 5,364,759. Likewise, assays employing a PCR amplification test for bovine genetic markers linked to milk production have been described. See, e.g., U.S. Pat. No. 5,614,364. U.S. Pat. No. 5,436,130 describes a DNA sequencing method that employs single lane electrophoresis, a binary coding scheme using two different fluorescent labels, and a laser-excited, confocal fluorescence scanner for sequencing four sets of DNA sequencing fragments. A method that employs single lane electrophoresis and four different tags (fluorophores) for four sample fragments to be sequenced is described in Smith et al. (1986) Nature 321:674-679, and a method that employs one fluorescent tag for all fragments to be sequenced, but uses multiple lane electrophoresis (each fragment is run in its own lane), is described by Ansorge et al. (1986) J. Biochem. Biophys. Methods 13:315-323.
Still further, polymorphic DNA variation can be on the basis of sequence, for example those variations resulting from single nucleotide polymorphisms (SNPs) that exist between individuals of a particular population. In some instances, such sequence variations are characteristic of genetic disease; however, the majority of known SNPs occur in noncoding regions of a genome, and are thus useful for genotyping applications, gene mapping, drug development, forensics, and the like.
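For illustration, sequence-based variation of this kind can be located by comparing a sample against a reference, position by position. The sketch below assumes two already-aligned sequences of equal length with no insertions or deletions; both sequences and the function name `snp_positions` are hypothetical.

```python
def snp_positions(reference, sample):
    """Return (index, reference_base, sample_base) for every position
    at which two equal-length, pre-aligned sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (index, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Hypothetical aligned sequences from two individuals.
reference = "ATGCCGTAAGCT"
sample = "ATGCTGTAAGCC"

for index, ref_base, sample_base in snp_positions(reference, sample):
    print(f"position {index}: {ref_base} -> {sample_base}")
```

Indices here are zero-based. Unlike the length-based markers discussed earlier, the two sequences have the same length, so the difference would not be visible from fragment size alone.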