For many purposes, it is important to be able to identify rapidly and accurately the species to which an organism belongs. Such rapid identification is necessary for pathogens such as viruses, bacteria, protozoa, and multicellular parasites, and assists in the diagnosis and treatment of human and animal disease, as well as in studies of epidemiology and ecology. In particular, because bacteria grow rapidly and the diseases they cause require immediate and accurate treatment, a fast method of identification is especially important.
Traditionally, identification and classification of bacterial species have been performed by study of morphology, determination of nutritional requirements or fermentation patterns, determination of antibiotic resistance, comparison of isoenzyme patterns, or determination of sensitivity to bacteriophage strains. These methods are time-consuming, typically requiring at least 48 to 72 hours and often much more. Other, more recent methods include the determination of RNA sequences (Woese, in "Evolution in Procaryotes" (Schleifer and Stackebrandt, Eds., Academic Press, London, 1986)), the use of strain-specific fluorescent oligonucleotides (DeLong et al., Science 243, 1360-1363 (1989); Amann et al., J. Bact. 172, 762-770 (1990)), and the polymerase chain reaction (PCR) technique (U.S. Pat. Nos. 4,683,195 and 4,683,202 to Mullis et al.; Mullis & Faloona, Methods Enzymol. 154, 335-350 (1987)).
In addition, DNA markers genetically linked to a selected trait can be used for diagnostic procedures. The DNA markers commonly used are restriction fragment length polymorphisms (RFLPs). Polymorphisms useful in genetic mapping are those that segregate in populations. Traditionally, RFLPs have been detected by hybridization methods (e.g., Southern blotting), but such techniques are time-consuming and inefficient. Alternative methods include PCR-based assays for polymorphisms.
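As an illustration of what a restriction fragment length polymorphism is, the following minimal sketch digests two hypothetical alleles in silico and compares the fragment lengths. The sequences, the `digest` helper, and the choice of EcoRI (recognition site GAATTC, cutting after the first G) are assumptions made for illustration and are not taken from the text above.

```python
def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting seq at every occurrence of site.

    cut_offset is the position of the cut within the recognition site.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Assumption: EcoRI recognizes GAATTC and cuts between G and A (offset 1).
ECORI_SITE = "GAATTC"
ECORI_OFFSET = 1

# Hypothetical allele with two EcoRI sites.
allele_a = "ATGC" * 5 + "GAATTC" + "CCGG" * 10 + "GAATTC" + "TTAA" * 3
# Hypothetical second allele: a point mutation destroys the first site.
allele_b = allele_a.replace("GAATTC", "GAGTTC", 1)

fragments_a = digest(allele_a, ECORI_SITE, ECORI_OFFSET)
fragments_b = digest(allele_b, ECORI_SITE, ECORI_OFFSET)
print("allele A fragments:", fragments_a)
print("allele B fragments:", fragments_b)
```

The two alleles yield different fragment-length patterns from the same enzyme, which is the difference a Southern blot would display as bands of different sizes.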
The PCR method allows amplification of a selected region of DNA by providing two DNA primers, each of which is complementary to a portion of one strand within the selected region of DNA. These primers are used to hybridize to the separated strands within the region of DNA sought to be amplified, forming DNA molecules that are partially single-stranded and partially double-stranded. The double-stranded regions are then extended by the action of DNA polymerase, forming completely double-stranded molecules. These double-stranded molecules are then denatured and the denatured single strands are rehybridized to the primers. Repetition of this process through a number of cycles results in the generation of DNA strands that correspond in sequence to the region between the originally used primers. Specific PCR primer pairs can be used to identify genes characteristic of a particular species or even strain. PCR also obviates the need for cloning in order to compare the sequences of genes from related organisms, allowing the very rapid construction of phylogenies based on DNA sequence. For epidemiological purposes, specific primers to informative pathogenic features can be used in conjunction with PCR to identify pathogenic organisms.
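The primer-bounded amplification described above can be sketched in silico. The following is a simplified model under stated assumptions: primers match the template exactly, each primer has a single binding site, and every cycle doubles the number of copies (an idealization; real reactions are less efficient). The function names and the example sequences are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplify(template: str, fwd_primer: str, rev_primer: str):
    """Return the top-strand product bounded by the two primers, or None.

    The forward primer matches the top strand directly; the reverse primer
    anneals to the top strand, so its reverse complement is searched for
    downstream of the forward primer.
    """
    start = template.find(fwd_primer)
    if start == -1:
        return None
    rev_site = reverse_complement(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


def ideal_copies(cycles: int, initial: int = 1) -> int:
    """Idealized copy number, assuming perfect doubling in every cycle."""
    return initial * 2 ** cycles


# Hypothetical template and primer pair.
template = "TTTT" + "ACGTAC" + "GGGCCCAAA" + "TTGCAG" + "CCCC"
fwd = "ACGTAC"
rev = reverse_complement("TTGCAG")

product = amplify(template, fwd, rev)
print(product, len(product))
print(ideal_copies(30))
```

If either primer lacks a matching site, `amplify` returns `None`, which mirrors the point that conventional PCR requires prior sequence knowledge of both flanking regions.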
Although PCR is a very powerful method for amplifying DNA, conventional PCR procedures require the use of at least two separate primers complementary to specific regions of the genome to be amplified. This requirement means that primers cannot be prepared unless the target DNA sequence information is available, and the primers must be "custom built" for each location within the genome of each species or strain whose DNA is to be amplified.
Although the newer methods have advantages over previous methods for genome identification, there is still a need for a rapid, simple method that can be applied to any species for which DNA can be prepared and that does not require reagents that are specific for each species or knowledge of the molecular biology, biochemistry, or DNA sequence of that species. It is also desirable that such a method be capable of identifying a species from a relatively small quantity of biological material. Additionally, it is highly desirable that such a method also be capable of generating polymorphisms useful in genetic mapping, especially of eukaryotes.
In addition to identification of related plant, animal, and bacterial species, DNA segments or "markers" may be used to construct human genetic maps for genome analysis. Goals for the present human genome project include the production of a genetic map and an ordered array of clones along the genome. Using a genetic map, inherited phenotypes, such as those that cause genetic diseases, can be localized on the map and ultimately cloned. The neurofibromatosis gene is a recent example of this strategy (Xu et al., Cell 62:599-608 (1990)). The genetic map is a useful framework upon which to assemble partially completed arrays of clones. In the short term, it is likely that arrays of human genomic clones such as cosmids or yeast artificial chromosomes (YACs, Burke et al., Science 236:806-812 (1987)) will form disconnected contigs that can be oriented relative to each other with probes that are on the genetic map or the in situ map (Lichter et al., Science 247:64-69 (1990)), or both. The usefulness of the contig map will depend on its relation to interesting genes, the locations of which may only be known genetically. Similarly, the restriction maps of the human genome generated by pulsed field electrophoresis (PFE) of large DNA fragments are unlikely to be completed without the aid of closely spaced markers to orient partially completed maps. Thus, a restriction map and an array of clones covering an entire mammalian genome, for example the mouse genome, are desirable.
Recently, RFLPs that have Variable Number Tandem Repeats (VNTRs) have become a method of choice for human mapping because such VNTRs tend to have multiple alleles and are genetically informative, since polymorphisms are more likely to be segregating within a family. The production of fingerprints by Southern blotting with VNTRs (Jeffreys et al., Nature 316:76-79 (1985)) has proven useful in forensics. There are two classes of VNTRs: one having repeat units of 9 to 40 base pairs, and the other consisting of minisatellite DNA with repeats of two or three base pairs. The longer VNTRs have tended to be in the proterminal regions of autosomes. VNTR consensus sequences may be used to display a fingerprint. VNTR fingerprints have been used to assign polymorphisms in the mouse (Julier et al., Proc. Natl. Acad. Sci. USA, 87:4585-4589 (1990)), but these polymorphisms must be cloned to be of use in restriction mapping or contig assembly. VNTR probes are useful in the mouse because a large number of crosses are likely to be informative at a particular position.
The mouse offers the opportunity to map in interspecific crosses, which have a high level of polymorphism relative to most other inbred lines. A dense genetic map of DNA markers would facilitate cloning genes that have been mapped genetically in the mouse. Cloning such genes would be aided by the identification of very closely linked DNA polymorphisms. About 3000 mapped DNA polymorphisms are needed to provide a good probability of one polymorphism being within 500 kb of the gene. To place so many DNA markers on the map, it is desirable to have a fast and cost-effective genetic mapping strategy.
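The marker-density figure can be checked with a rough calculation. The sketch below assumes a mouse genome of about 3 x 10^9 base pairs (a round number that is not stated in the text) and markers placed independently and uniformly at random; both are simplifying assumptions, and the function name is hypothetical.

```python
def prob_marker_within(n_markers: int, genome_bp: int, distance_bp: int) -> float:
    """Probability that at least one of n randomly placed markers falls
    within distance_bp on either side of a given gene.

    Assumes uniform, independent marker placement and ignores chromosome ends.
    """
    window = 2 * distance_bp  # distance_bp upstream plus distance_bp downstream
    return 1 - (1 - window / genome_bp) ** n_markers


GENOME_BP = 3_000_000_000  # assumed genome size, not taken from the text
N_MARKERS = 3000
DISTANCE_BP = 500_000      # 500 kb

mean_spacing = GENOME_BP // N_MARKERS
p = prob_marker_within(N_MARKERS, GENOME_BP, DISTANCE_BP)
print(f"mean marker spacing: {mean_spacing} bp")
print(f"P(marker within 500 kb of a gene): {p:.3f}")
```

Under these assumptions the mean spacing is 1 Mb and the probability is roughly 0.63. Perfectly even spacing of the same 3000 markers would instead place every gene within 500 kb of a marker, so the real figure depends on how the markers are distributed.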