Interspersed repetitive DNA sequence elements have been characterized extensively in eukaryotes, although their function remains largely unknown. The conserved nature and interspersed distribution of these repetitive sequences have been exploited to amplify the unique sequences lying between them by the polymerase chain reaction (PCR). Additionally, species-specific repetitive DNA elements have been used to differentiate between closely related murine species.
Prokaryotic genomes are much smaller than the genomes of mammalian species (approximately 10^6 versus 10^9 base pairs of DNA, respectively). Since these smaller prokaryotic genomes are maintained through selective pressures for rapid DNA replication and cell reproduction, non-coding repetitive DNA should be kept to a minimum unless it is maintained by other selective forces. For the most part, prokaryotes have a high density of transcribed sequences. Nevertheless, families of short intergenic repeated sequences occur in bacteria.
The presence of repetitive sequences has been demonstrated in many different bacterial species. Reports of novel repeated sequences in the eubacterial genera Escherichia, Salmonella, Deinococcus, Calothrix, and Neisseria, and in the fungi Candida albicans and Pneumocystis carinii, illustrate the presence of dispersed extragenic repetitive sequences in many organisms. One such family of repetitive DNA sequences in eubacteria is the Repetitive Extragenic Palindromic (REP) elements. The consensus REP sequence for this family is a 38-mer containing six totally degenerate positions, including a 5 bp variable loop between the two sides of the conserved stem of the palindrome. Another family of repetitive elements is the Enterobacterial Repetitive Intergenic Consensus (ERIC) sequences. ERIC is larger (its consensus sequence is a 126-mer) and contains a highly conserved central inverted repeat. The ERIC and REP consensus sequences do not appear to be related.
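The stem-and-loop arrangement described for the REP palindrome can be illustrated with a minimal sketch. The 17-base sequence, the stem length of 6, and the function names below are hypothetical and chosen only for illustration; they are not the REP consensus. The sketch checks whether one side of a candidate stem is the reverse complement of the other side, with a fixed-length loop between them.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string made of A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def is_stem_loop(seq, stem_len, loop_len):
    """True if seq is: stem, then a loop of loop_len bases, then the
    reverse complement of the stem (a perfect inverted repeat)."""
    if len(seq) != 2 * stem_len + loop_len:
        return False
    left = seq[:stem_len]
    right = seq[stem_len + loop_len:]
    return left == reverse_complement(right)


# Hypothetical toy element: 6-base stem, 5-base loop, 6-base opposing stem.
toy_element = "GCCGGA" + "ACGTA" + "TCCGGC"
# Same element with the final base changed, which breaks the stem pairing.
toy_mutant = "GCCGGA" + "ACGTA" + "TCCGGA"

palindromic = is_stem_loop(toy_element, stem_len=6, loop_len=5)
mutant_palindromic = is_stem_loop(toy_mutant, stem_len=6, loop_len=5)
```

The loop bases are never compared, which mirrors the idea of a variable loop flanked by a conserved stem. A real consensus with degenerate positions would need a tolerant comparison rather than the exact equality used here.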
Previous studies have used repeated rRNA genes as probes in Southern blots to detect restriction fragment length polymorphisms (RFLPs) between strains. Repeated tRNA genes have been used as consensus primer-binding sites to directly amplify, by PCR, DNA fragments whose sizes differ between organisms (tDNA-PCR). Limitations of both techniques include the use of radioisotopes and of time-intensive methods, such as Southern blotting and polyacrylamide gel electrophoresis, to clearly distinguish subtle differences in the sizes of the DNA fragments generated. The latter technique could distinguish organisms only at the species and genus level: tDNA-PCR fingerprints are generally invariant between strains of a given species and between related species. Other previous studies include the use of species-specific repetitive DNA elements as primer-binding sites for PCR-based bacterial species identification. Though such methods allow species identification by PCR with picogram amounts of DNA, only single PCR products are generated, which precludes the generation of strain-specific genomic fingerprints.
Although these previous studies demonstrated that species-specific repetitive DNA elements can be used as primer-binding sites for PCR-based bacterial species identification, these methods generated only a single PCR product in a single species. The present invention provides a novel approach to using extragenic repetitive sequences to directly fingerprint bacterial genomes. Analysis of the amplification products obtained by amplifying the unique sequences between primers directed to bacterial DNA repeat sequences reveals the unique distances between those repeat sequences. This pattern of distances uniquely fingerprints different bacterial species and strains. Thus, this approach provides a quick and reliable method to type bacteria by genomic fingerprinting.
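The notion of a fingerprint as a pattern of distances between repeat sequences can be sketched in silico. The following is a simplified illustration, not the laboratory method: it scans one strand only, ignores primer orientation and amplification efficiency, and uses a hypothetical 5-base degenerate motif (GCNGC) and hypothetical toy genomes rather than any real repeat consensus. The function names and the max_len cutoff are likewise assumptions made for the example.

```python
import re

# IUPAC degenerate nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_repeats(genome, consensus):
    """Return the start index of every match of a degenerate consensus.
    A lookahead is used so that overlapping matches are also reported."""
    body = "".join(IUPAC[base] for base in consensus)
    pattern = re.compile("(?=" + body + ")")
    return [m.start() for m in pattern.finditer(genome)]


def inter_repeat_distances(genome, consensus, max_len=5000):
    """Return the sorted lengths of the unique sequence lying between
    consecutive repeat copies, keeping only gaps of 1..max_len bases
    (an assumed stand-in for the size range PCR can amplify)."""
    starts = find_repeats(genome, consensus)
    k = len(consensus)
    gaps = [nxt - (cur + k) for cur, nxt in zip(starts, starts[1:])]
    return sorted(g for g in gaps if 0 < g <= max_len)


# Hypothetical toy genomes: same repeats, different spacing of the third copy.
strain_a = "AAAA" + "GCAGC" + "T" * 10 + "GCTGC" + "A" * 25 + "GCGGC" + "TT"
strain_b = "AAAA" + "GCAGC" + "T" * 10 + "GCTGC" + "A" * 40 + "GCGGC" + "TT"

fingerprint_a = inter_repeat_distances(strain_a, "GCNGC")  # [10, 25]
fingerprint_b = inter_repeat_distances(strain_b, "GCNGC")  # [10, 40]
```

The two toy strains share one inter-repeat distance and differ in the other, so their distance lists differ, which is the property a strain-level fingerprint relies on. In practice the distances would be read from the sizes of amplified fragments rather than computed from a known sequence.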