There are many situations where the identification of an organism in a sample is important. This is true in the analysis of clinical, veterinary, food and environmental samples. Currently it is possible to carry out such identification either by classical microbiological methods (where growth characteristics and/or biochemical parameters are monitored), by immunodiagnostic methods or by DNA-based methods.
DNA probes have been used for the identification of organisms as described, for example, in WO 89/06704 and U.S. Pat. No. 4,851,330. Many such probes derive from the observation (see Woese, Scientific American 244 (6) 1981 for review) that parts of the 16S and 23S ribosomal RNA (rRNA) sequences vary between species. This information was used initially for phylogenetic analyses, but it has more recently been used for DNA probe-based methods for the identification of organisms. The rRNA sequences which are characteristic of an organism were obtained initially in one of two ways: by differential hybridization experiments, in which the target DNA gave a positive signal and the organism from which it was required to be distinguished gave a negative signal; or by procedures using hybridization experiments carried out in liquid, in which sequences common to both species are eliminated and sequences (which may be anonymous) which are unique to the organism of interest are retained. A final category of target sequences for DNA probes comprises regions of the genome which code for some antigen or biochemical product characteristic of the organism.
In all cases the success of the DNA probe depends on its ability to detect a target sequence in the organism of interest while failing to hybridize to a panel of other organisms that are either closely related to the organism of interest or are likely to occur in the sample under study. Hybridization of a probe to target DNA depends on the DNA sequence and on the hybridization conditions used. There are well established guidelines for the selection of conditions which will allow DNA probes to distinguish between two very closely related sequences (Maniatis, T., et al. (1982) Cold Spring Harbor Publication). The design of DNA probes can be optimized if the DNA sequences targeted have maximum differences from those of other organisms and if a comprehensive data bank of DNA sequences in the region under study is available.
The DNA sequence of a segment of the genome of an organism can be obtained by isolating the DNA segment using a variety of techniques which are widely used by those that employ recombinant DNA methods (see Maniatis, T., et al. supra). More recently, methods for amplifying the region of interest, such as the Polymerase Chain Reaction (PCR) (Saiki et al. (1985) Science 230 1350-1354), have been described.
The PCR technique requires two oligonucleotide primers that flank the DNA segment to be amplified. Repeated cycles of heat denaturation of the DNA, annealing of the primers to their complementary DNA sequences at a lower temperature, and extension of the annealed primers with a DNA polymerase, which may be thermostable, in the presence of the four deoxyribonucleotides give rise to specific DNA sequences in sufficient quantities to be manipulated further.
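The flanking-primer arrangement described above can be illustrated with a minimal sketch in Python. It is not part of the cited method: the function names (reverse_complement, in_silico_pcr, ideal_copies) are hypothetical, primers are assumed to match the template exactly, and the copy-number figure assumes perfect doubling in every cycle, which a real reaction does not achieve.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def in_silico_pcr(template: str, forward: str, reverse: str):
    """Return the segment bounded by the two primers, or None if a primer has no site.

    Assumption: exact matching only. `forward` is identical to part of the
    given (top) strand; `reverse` anneals to the top strand, so its reverse
    complement is what appears in the top-strand text.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    # The reverse primer site must lie downstream of the forward primer.
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


def ideal_copies(initial_copies: int, cycles: int) -> int:
    """Upper bound on copy number, assuming every cycle doubles the target."""
    return initial_copies * 2 ** cycles
```

For a template in which the forward primer ACGTAC and the reverse primer CTGCAA flank a short insert, in_silico_pcr returns the insert together with both primer sites, and ideal_copies(1, 30) gives the theoretical ceiling after thirty cycles from a single starting molecule.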
In one use of the PCR method, Chen, K., et al. (FEMS Microbiology Letters (1989) 57, 19-24) amplified E. coli DNA using primers derived from regions of the 16S rRNA gene which tend to be conserved in a variety of organisms examined. A similar approach has been used by Medlin, L., et al. (Gene (1988) 71, 491-499), who amplified eucaryotic rRNA coding regions for a phylogenetic study.
The choice of a target sequence for a probe currently involves a) the identification of an area of sufficient inter-species diversity or variation to allow for the provision of a specific probe and b) the selection of a target which is preferably present in the organism in a high number of copies. The rRNA gene products (16S and 23S) appear to fulfil both of these criteria and as such have been the target for many studies and, indeed, DNA probe kits directed to those regions are available commercially for some organisms. Most comparisons to date have been between the rRNA genes from different genera, and these have highlighted a pattern of variable regions within the gene flanked by adjacent, more conserved regions.
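The pattern of variable regions flanked by conserved regions can be located in a set of aligned sequences with a short sketch such as the following, given in Python. It is illustrative only: the function names (variable_columns, variable_runs) and the min_length parameter are hypothetical, the sequences are assumed to be already aligned to equal length, and a column is treated as variable whenever any two sequences differ at that position.

```python
def variable_columns(aligned):
    """Return one boolean per alignment column: True if the sequences differ there."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [len(set(column)) > 1 for column in zip(*aligned)]


def variable_runs(aligned, min_length=2):
    """Return (start, end) half-open index pairs for runs of variable columns.

    Runs shorter than `min_length` columns are ignored. The columns on either
    side of each reported run are, by construction, conserved.
    """
    flags = variable_columns(aligned)
    runs = []
    start = None
    # A trailing False acts as a sentinel that closes a run ending at the last column.
    for index, is_variable in enumerate(flags + [False]):
        if is_variable and start is None:
            start = index
        elif not is_variable and start is not None:
            if index - start >= min_length:
                runs.append((start, index))
            start = None
    return runs
```

Applied to sequences from closely related species rather than from different genera, a scan of this kind would report few or no runs wherever the nominally variable regions are in fact nearly identical.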
We have found that when related species are compared the "variable" regions are occasionally very similar (see Example 2), if not identical. This highlights the need for a method to obtain sequence data from a catalogue of organisms to allow one to select the correct probe sequence and the appropriate hybridization conditions or to identify regions of the genome of a microorganism in which greater variability occurs.
It is an object of the present invention to provide a method for obtaining DNA sequences which can be used to provide a DNA data base for the choice of probe and hybridization conditions in a rapid and useful manner.
It is a further object of the present invention to identify new target areas that contain a high degree of diversity between organisms and to generate highly specific DNA probes thereto in a variety of organisms of interest to the clinician, the veterinary practitioner, the food technologist and the environmentalist.