Pathogenic strains of E. coli are a common target for identification in clinical settings. For example, E. coli O157:H7 is a pathogenic bacterium that causes severe diarrhea, hemorrhagic colitis and hemolytic uremic syndrome (Nataro and Kaper 1988, Whittam 1993). Rapid identification of E. coli O157:H7, other Shiga toxin-producing E. coli (“STEC”), other enteropathogenic E. coli (e.g., O26:H11) and nonpathogenic E. coli is critical for proper treatment and control of epidemics (McDonald and Osterholm 1993, Majkowski 1997).
Additionally, in connection with identifying such bacteria, there is also interest in discovering which drugs are effective against such microorganisms so that a treatment regimen can be initiated. Many of the current methods used to diagnose pathogenic and drug-resistant strains of bacteria require that the suspect organism be isolated as a bacterial monoculture and incubated over a number of days, only after which the pathogenic strain can be identified by biochemical tests (see review in Swaminathan 1994). Such tests include phage typing, sorbitol fermentation, beta-glucuronidase production, protein identification by immunological means, colony hybridization with DNA probes, and restriction fragment length polymorphism (“RFLP”) analysis.
Assays incorporating nucleic acid amplification have the potential to lower costs and considerably shorten assay time because of their increased organism-specific sensitivity and their ability to identify particular organisms (genera and strains) directly from mixed culture samples. Such shortening of time and lowering of costs will allow patient samples to be tested for specific pathogenic strains on a routine basis, in contrast to the current practice of testing for such organisms on an “as needed” basis.
Numerous nucleic acid-based methodologies have been devised for microorganism identification, including the use of DNA amplification (see Arbeit 1995, Whelen and Pershing 1996). One method, the random amplification of polymorphic DNA (“RAPD”), uses a single random primer in a polymerase chain reaction (“PCR”) to obtain a fingerprint of random amplification products. The RAPD technique suffers from the need to establish monocultures prior to strain detection and identification. Moreover, the RAPD technique requires a technician trained to interpret the complex pattern of bands that it generates.
Another method targets regions of nucleotide sequence that encode, or are related to the production of, a pathogen's toxin. For example, specific gene loci for Shiga toxins, such as slt-I and slt-II, afford the ability to distinguish E. coli O157:H7 from strains of non-toxin-producing pathogens. However, the use of toxin-specific loci is limited because testing is applicable only to the narrow range of microbial species that encode the specific toxin being tested.
Another bacterial identification system uses 16S ribosomal RNA (Weisberg et al. 1991, Neefs et al. 1993). The 16S system is less than desirable because multiple copies of the rRNA gene may contain multiple polymorphisms (a situation known as heteroplasmy), making identification of a specific bacterial strain difficult. Additionally, while the system is able to separate unknown bacteria to the species level, it cannot differentiate strains below the species level.
Still other methods that use genetic sequences in hybridization-oriented identification have been documented. In particular, a method of identifying microorganisms using polymorphisms within the type II DNA topoisomerase genes has been disclosed (W. M. Huang, Annu. Rev. Genet., 1996, Vol. 30, pp 79–107; I. Guillemin et al., Antimicrobial Agents and Chemotherapy, September 1995, Vol. 39, No. 9, pp 2145–2149; U.S. Pat. No. 5,645,994, by W. M. Huang, all of which are herein incorporated by reference).
Prokaryotic and eukaryotic type II topoisomerases are related in their structure and function. These molecules are essential for maintenance of DNA superhelicity for DNA replication. One type II topoisomerase from bacteria is DNA gyrase. Bacterial DNA gyrase is composed of two subunits, GyrA and GyrB. The amino acid sequence of the GyrA subunit is highly conserved between prokaryotic and eukaryotic organisms. However, at the DNA level, codon usage and G-C content are markedly divergent. The divergence in the nucleic acid sequences has provided the basis for the development of rapid methodology to identify new bacterial topoisomerase genes (see W. M. Huang 1996).
Comparison of a variety of prokaryotic Gyrase A genes shows that the length of the protein encoded by such genes is in the range of 850 amino acids. The overall identity among GyrA proteins from these different organisms is only about 40%, with the greatest variability occurring in the C-terminal third of the sequences. However, the N-terminal portion of the genes is highly conserved, and this conservation allows the various species to be grouped in a manner consistent with the grouping elucidated using rRNA sequence analysis (see Neefs 1993 and Olsen 1994).
The known GyrB subunit gene sequences encode proteins of between 650 and 800 amino acids in length. In general, the GyrB proteins from various organisms tested share approximately 60% overall amino acid sequence identity.
A second type II topoisomerase gene known in E. coli has sequence identity with GyrA. Specifically, the parC gene has 36% identity with the GyrA gene and is generally shorter, encoding a protein of about 750 amino acids.
Alignment of type II topoisomerase genes from a variety of organisms has revealed that the N-terminal region is highly conserved at the amino acid level such that there are at least nine regions having at least five invariant amino acids interspersed with more variable regions. The consensus regions provide DNA sequences that are useful for designing “universal” primers for the amplification of intervening variable regions. The availability of nucleic acid sequences of the intervening variable regions has allowed identification of new topoisomerase genes in such organisms and consequently the ability to study biodiversity at the species level.
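By way of illustration only, the search for consensus regions described above can be sketched in Python as follows. The sketch assumes the sequences have already been aligned to equal length; the function name `invariant_runs`, its parameters, and the toy peptide strings are hypothetical and are not taken from any disclosed sequence.

```python
def invariant_runs(aligned, min_len=5):
    """Return half-open (start, end) column ranges in which every aligned
    sequence carries the same residue for at least `min_len` columns.

    `aligned` is a list of equal-length strings; "-" marks an alignment gap
    and is never counted as conserved.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    runs = []
    start = None
    # Iterate one column past the end so a run reaching the last column is closed.
    for col in range(length + 1):
        conserved = (
            col < length
            and aligned[0][col] != "-"
            and all(seq[col] == aligned[0][col] for seq in aligned)
        )
        if conserved:
            if start is None:
                start = col
        else:
            if start is not None and col - start >= min_len:
                runs.append((start, col))
            start = None
    return runs


# Hypothetical toy alignment (not real topoisomerase sequence).
toy_alignment = [
    "MKLVDAGRTPQ",
    "MKLVDSGRTPQ",
    "MKLVDTGRTPQ",
]
conserved_regions = invariant_runs(toy_alignment, min_len=5)
```

In this toy case two invariant blocks flank a single variable column. In practice the variable stretch between two such blocks would be the region targeted for amplification, with primers designed from the flanking blocks.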
For example, U.S. Pat. No. 5,645,994 by W. M. Huang discloses a method of identifying species of bacteria by amplifying variable or “signature” sequences that are interspersed between the conserved sequences. The flanking conserved sequences are used to design universal primers for amplification of the signature sequences. Following amplification, the signature sequences are cloned and sequenced and the sequence is compared against a database of signature sequences from multiple species. Likewise, Huang discloses that alignment of the DNA sequences from isolates of one genus can be used to examine micro-diversity among species of a genus.
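The database comparison step described above may be pictured with the following minimal Python sketch. It is an assumption-laden illustration, not the method of the cited patent: it scores pre-aligned, equal-length sequences by simple percent identity, and the names `percent_identity`, `best_match`, `species_A`, and `species_B`, together with the toy sequences, are hypothetical.

```python
def percent_identity(a, b):
    """Percent of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_match(query, database):
    """Return (name, identity) for the database entry closest to `query`.

    `database` maps a label to a signature sequence of the same length
    as the query.
    """
    if not database:
        raise ValueError("database must not be empty")
    name = max(database, key=lambda key: percent_identity(query, database[key]))
    return name, percent_identity(query, database[name])


# Hypothetical signature database (toy sequences, not real gene data).
toy_database = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTTCGTAA",
}
match_name, match_identity = best_match("ACGTTCGTAA", toy_database)
```

A real comparison would normally rely on an alignment tool that handles insertions and deletions; position-by-position identity is used here only to keep the sketch short.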
The current invention provides numerous polymorphisms recently discovered in the GyrA, GyrB, and parC genes encoding type II topoisomerase subunits that have application in the detection and identification of subspecies of pathogenic and nonpathogenic bacteria. The current invention also provides polymorphisms identified in the type II topoisomerases that are associated with drug resistance, wherein the proteins and the regulation of the genes in which the polymorphisms are found are not affected by or biochemically associated with the function of the drug.
For example, with regard to drug resistance, outbreaks of drug-resistant strains of Staphylococcus aureus occur periodically in clinical environments such as hospitals, where there may be concentrations of patients with compromised immune systems (Herwaldt and Wenzel 1995). Rapid identification of such resistant strains is recognized as being crucial for the adoption of appropriate treatment regimens (Morita 1993). Most important in such resistance outbreaks has been resistance to methicillin. A gene locus frequently responsible for methicillin resistance is the mecA locus (see Archer and Niemeyer 1994), which, along with surrounding noncoding regions, has been the target of amplification-based assays (e.g., Murakami et al. 1991). While the mecA gene provides a direct link to methicillin resistance, the locus is specific to the genus Staphylococcus and thus is of limited utility as a general diagnostic because only drug-resistant Staphylococcus aureus can be identified. Moreover, because mecA DNA is susceptible to horizontal transfer between bacteria (Archer and Niemeyer 1994, Wu et al. 1996), the potential for misidentification exists, which is a serious drawback to the use of mecA as an identification marker for pathogenic S. aureus.
In contrast, topoisomerase type II polymorphisms have been used to identify drug resistance in microorganisms. Specifically, the Gyrase A gene has been used to study resistance of certain bacterial strains to fluoroquinolone (“FQ”) antibiotics (e.g., Mycobacterium sp., Guillemin 1995; Campylobacter sp. and Helicobacter sp., Husmann 1997; and Staphylococcus aureus, Wang 1998). Biochemically, FQ resistance arises because the mutation in the GyrA protein sequence interferes with the ability of the antibiotic to interact with GyrA/DNA complexes, resulting in continued growth and division of the replicating organism. It has been observed that the mutations responsible for FQ resistance are clustered within a small pocket of amino acids in the N-terminal portion of the protein. Since the biochemistry and the genetics of the GyrA gene suggest the involvement of a small number of amino acids, the amino acids at these positions can be correlated with the general antibiotic susceptibility of these bacteria. Thus, as suggested by Guillemin, a screening method may be developed to identify species having resistance to FQ antibiotics based on the mutations in the Gyrase A gene.
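A screening step of the kind suggested above could, purely as an illustration, compare the residues at a small set of positions against a susceptible reference. The following Python sketch assumes pre-aligned protein sequences of equal length; the function name `pocket_substitutions`, the toy sequences, and the positions examined are hypothetical placeholders and do not correspond to actual resistance-associated residues.

```python
def pocket_substitutions(reference, query, positions):
    """List substitutions (e.g. "R6K") at the given 1-based positions.

    `reference` and `query` are aligned protein sequences of equal length;
    `positions` is an iterable of 1-based residue numbers to inspect.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    changes = []
    for pos in sorted(positions):
        if not 1 <= pos <= len(reference):
            raise ValueError(f"position {pos} is outside the sequence")
        ref_aa = reference[pos - 1]
        qry_aa = query[pos - 1]
        if ref_aa != qry_aa:
            changes.append(f"{ref_aa}{pos}{qry_aa}")
    return changes


# Hypothetical reference and isolate fragments (toy data only).
toy_reference = "MSDLAREITP"
toy_isolate = "MSDLAKEITP"
toy_positions = {3, 6}  # illustrative positions, not real pocket residues
found = pocket_substitutions(toy_reference, toy_isolate, toy_positions)
```

Any substitution reported by such a comparison would still need to be correlated with measured antibiotic susceptibility before it could serve as a marker.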
Of greater significance, we have discovered polymorphisms in the Gyrase A gene that are associated with resistance to non-FQ antibiotics, where the resistance is not involved in or associated with the functionality of topoisomerase:DNA complexes. This discovery is very important because it indicates that polymorphisms in the GyrA subunit are indicative of subtle but distinct differences between organisms where there is no known evolutionary pressure that would assist an organism in developing such genetic divergence.
Although the prior disclosures are directed to the use of Gyrase A gene polymorphisms in the identification of species of organisms and at least one class of antibiotic resistance, such prior disclosures have failed to recognize or disclose an association between topoisomerase type II sequence polymorphisms and significant divergence between very closely related organisms. For example, pathogenic strains of E. coli that have been isolated and classified as strain O157:H7 have been found to include numerous polymorphisms. Thus, it is questionable whether classifying such isolates as only one strain (i.e., O157:H7) is satisfactory. Likewise, it has been found that E. coli strain K12, which has traditionally been regarded as the same strain as wild type E. coli ATCC 11775, is divergent from the wild type strain and is actually a separate “laboratory” strain, as indicated by divergence in the Gyrase A gene (see below).
The current invention recognizes the importance of these subtle divergences within the GyrA, GyrB, and parC proteins of the topoisomerase family and provides numerous polymorphisms useful for the identification of closely related organisms that may be heretofore unrecognized subspecies variations within populations of organisms that have traditionally been classified together as a single species.