The present invention relates to classification and typing of prokaryotes and, more particularly, to abundant, well distributed and hyperpolymorphic simple sequence repeats in prokaryote genomes and the use of same for prokaryote classification and typing.
Simple sequence repeats (SSRs) are a class of short sequences, usually of 1-6 nucleotides, that are tandemly (i.e., head to tail) repeated from two or three up to a few dozen times at a locus (Vogt 1990). SSRs have long been known to be distributed throughout the genomes of eukaryotes and to be highly polymorphic (Tautz 1989, Weber 1990, Kashi et al. 1990). Polymorphisms arise primarily through slipped-strand mispairing during DNA replication (Strand et al. 1994, Tautz and Schlotterer 1994). There is accumulating evidence that SSRs serve a functional role, affecting gene expression (Kunzler et al. 1995, Kashi et al. 1997, King et al. 1997, Kashi and Soller 1998).
The sequencing of complete genomes of many prokaryotes presented the opportunity to screen such genomes for the existence of SSRs (Field and Wills 1996, 1998), revealing arrays not detected in earlier studies. Recent publication of the complete genome sequence for Escherichia coli (Blattner et al. 1997) provides the basis for characterization of its SSR arrays, both at a gross genomic level and at particular SSR loci.
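By way of illustration only, the screening of a sequence for SSR arrays as defined above (motifs of 1-6 nucleotides repeated in tandem) may be sketched as follows. The function name `find_ssrs`, its parameters, and the threshold of three repeats are assumptions made for this example and are not taken from the cited studies; the scan reports non-overlapping matches per motif length and is not a description of the method actually used.

```python
import re

def find_ssrs(seq, max_motif=6, min_repeats=3):
    """Return sorted (start, motif, repeat_count) tuples for tandem repeats.

    Hypothetical helper: max_motif and min_repeats are illustrative
    thresholds, not values prescribed by the source text.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-nucleotide motif followed by at least (min_repeats - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" is already reported as the mononucleotide "A").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Made-up test sequence containing an A tract and a CA dinucleotide array.
demo = "GGC" + "A" * 7 + "TT" + "CA" * 4 + "GG"
for start, motif, count in find_ssrs(demo):
    print(start, motif, count)
```

Applied to a complete genome sequence, a scan of this kind would yield the positions and repeat counts of candidate SSR loci, which could then be tallied by motif length.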
Present-day approaches for typing of prokaryotes include growth in selective media, binding of specific antibodies, and amplification of DNA using the polymerase chain reaction. For example, conventional methods for detection of E. coli (Vanderzant and Spittstoesser 1992) include enrichment and isolation with selective or indicator media, such as E. coli (EC) broth, lauryl sulfate tryptose 4-methylumbelliferyl-β-D-glucuronic acid broth, eosin methylene blue agar, and MacConkey sorbitol agar. Procedures based on use of such media lead to identification of E. coli in a sample and estimation of its number, but lack the ability to distinguish among E. coli strains. Hence, the entire process of strain identification remains difficult and time-consuming. Recent methods for identification of E. coli strains use antibodies or nucleic acid sequences that uniquely bind to a particular strain or group of strains. Several methods for immunological detection or biochemical identification of the toxin-producing E. coli strain O157:H7 have been described (Farmer and Davis 1985, March and Ratnam 1986, Kleanthous et al. 1988, Smith and Scotland 1988, Todd et al. 1988, Karmali 1989, Padbye and Doyle 1991, Tyler et al. 1991). However, these assays do not distinguish among the various members of other serogroups. DNA amplification-based assays have been reported (Karch and Meyers 1989, Pollard et al. 1990, Johnson et al. 1990, Johnson et al. 1991, Jackson 1991, Yu and Kaper 1992, Witham et al. 1996), but most have limitations, including lengthy post-amplification detection protocols or lack of template quantification.
DNA sequence determination, on the other hand, is much simpler and more accurate.
There is thus a widely recognized need for, and it would be highly advantageous to have, a simple and rapid DNA sequence based technique for the classification and typing of prokaryotes.
While conceiving the present invention it was assumed that prokaryotic SSRs might be polymorphic and that such polymorphism might be class and type correlated and, if it indeed exists, could be used to provide a simple tool for the presently labor-intensive and complicated task of classification and typing of prokaryotes.
While reducing the present invention to practice, length polymorphism was shown at two mononucleotide SSR loci in E. coli. The existence of thousands of SSR arrays in E. coli and in a wide range of other prokaryotes that should exhibit hypervariability is shown as well. Interestingly, these SSR sites exhibit an upper size limit of 12 bp, suggesting selective mechanisms that might impose this size limit.
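As a minimal sketch of how length polymorphism at a mononucleotide SSR locus could, in principle, be used to group strains, the following example measures the tract length in the sequence of the same locus from several strains and groups strains sharing a length allele. The strain names, the sequences, and the helper functions `longest_run` and `group_by_allele` are hypothetical and serve only to illustrate the idea; they are not data or procedures from the present disclosure.

```python
from collections import defaultdict
from itertools import groupby

def longest_run(seq, base):
    """Length of the longest uninterrupted run of `base` in `seq` (0 if absent)."""
    return max((len(list(g)) for b, g in groupby(seq.upper()) if b == base), default=0)

def group_by_allele(locus_seqs, base):
    """Map each tract length to the sorted list of strains carrying it."""
    groups = defaultdict(list)
    for strain, seq in locus_seqs.items():
        groups[longest_run(seq, base)].append(strain)
    return {length: sorted(names) for length, names in sorted(groups.items())}

# Hypothetical sequences of one mononucleotide (G) locus in three strains.
example = {
    "strain_1": "ACGT" + "G" * 8 + "TCA",
    "strain_2": "ACGT" + "G" * 9 + "TCA",
    "strain_3": "ACGT" + "G" * 8 + "TCA",
}

print(group_by_allele(example, "G"))
```

In this made-up example, two strains share an 8-nucleotide tract and one carries a 9-nucleotide tract; combining such length alleles over several loci would give a multi-locus profile for each strain.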