I. Field of the Invention
The present invention relates to the general field of biochemical assays and separations, and to apparatus for their practice, generally classified in U.S. Patent Class 435/6.
II. Description of the Prior Art
Unlike multicellular organisms, bacteria and simple eukaryotic microorganisms have very limited morphological diversity and typically do not leave a significant fossil record. It was therefore initially very difficult to develop a classification system that reflects actual genetic relationships. Instead, classic bacterial taxonomic methods, such as morphology and carbon source utilization, were used to classify bacteria in a deterministic way. The goal was to develop a hierarchy of tests that ultimately could reproducibly assign a consistent name to an unknown isolate. When organisms gave very similar results on the various tests, they would ultimately be assigned to the same species regardless of actual genetic relationship. Thus, organisms that were fundamentally very different were sometimes grouped together.
This situation changed dramatically in the 1970's due to the pioneering work of Carl Woese and his colleagues. In order to obtain a genotypic classification, methods based on molecular sequence analysis of ribosomal RNA (rRNA) were developed. The rRNAs offered the advantage of being found in all organisms, and the equivalent molecules could be readily isolated and purified from essentially any organism. The large ribosomal RNAs vary in length and therefore have different names, e.g. 16S rRNA, 18S rRNA, etc., depending on the organism under consideration. To avoid this difficulty, the terminology small subunit RNA (SSU RNA) and large subunit RNA (LSU RNA) is used to specify any of the RNAs belonging to each class. Among the rRNAs, 5S rRNA, with approximately 120 nucleotides, was thought to be too short to be useful, and the LSU RNA (23S rRNA in bacteria) would have been far more difficult to work with. Attention therefore focused on the SSU RNA (16S rRNA in bacteria). 16S rRNA is a major component of the bacterial small ribosomal subunit. It consists of approximately 1,550 ribonucleotides in Escherichia coli and has an intricate secondary structure featuring extensive intrachain base pairing. The detailed three-dimensional folding of 16S rRNA in the Thermus aquaticus 30S ribosomal subunit has recently been determined by X-ray crystallography. As a major component of the ribosome, 16S rRNA interacts with 23S rRNA to establish the overall geometry of the ribosome and is directly involved in the initiation of protein biosynthesis by ribosomes.
When Woese first began using 16S rRNA in his evolutionary studies, it was not technically feasible to sequence the entire RNA. Therefore a characterization approach was developed (Uchida et al., 1974) in which the 16S rRNA was fragmented by the nuclease ribonuclease T1. This enzyme cleaves the RNA at guanosine (G) residues and thereby reduces the RNA to a collection of fragments of various lengths, each with a single terminal G. The non-G portion of each fragment was then sequenced. The list of all such fragments obtained from a single RNA was referred to as a catalog. Catalogs of ribonuclease T1 fragments from 16S rRNAs isolated from a variety of organisms were compared to one another, and cluster analysis was used to construct a tree of relationships among the various bacteria (Fox et al., 1977). By 1980, enough data of this type had accumulated that it was possible to construct the first trees that seriously attempted to identify the actual historical relationships between the various types of bacteria (Fox et al., 1980; Woese, 1987).
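The cataloging step described above can be illustrated in silico. The following Python sketch is offered only as an illustration under stated assumptions: the function names t1_catalog and catalog_overlap, the min_len cutoff, and the example sequences are hypothetical, and the simple overlap score is a stand-in for, not a reproduction of, the similarity measure used in the cited cluster analyses.

```python
from collections import Counter


def t1_catalog(rna, min_len=1):
    """Simulate ribonuclease T1 digestion: cut after every G.

    Returns a Counter mapping each fragment to the number of times it occurs.
    Fragments shorter than min_len (a hypothetical cutoff) are discarded.
    """
    fragments, current = [], []
    for base in rna.upper():
        current.append(base)
        if base == "G":
            # Cleavage on the 3' side of G leaves a fragment ending in G.
            fragments.append("".join(current))
            current = []
    if current:
        # The 3'-terminal fragment may lack a terminal G.
        fragments.append("".join(current))
    return Counter(f for f in fragments if len(f) >= min_len)


def catalog_overlap(cat_a, cat_b):
    """Fraction of distinct fragments shared by two catalogs (illustrative only)."""
    a, b = set(cat_a), set(cat_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Made-up sequences, not real 16S rRNA data.
    cat1 = t1_catalog("AUGCCGUAGGAUCG", min_len=3)
    cat2 = t1_catalog("AUGCCGUUGGAUCG", min_len=3)
    print(sorted(cat1), sorted(cat2), catalog_overlap(cat1, cat2))
```

A matrix of such pairwise scores over many catalogs is the kind of input on which a cluster analysis could operate.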
Later, as sequencing technology was improved, it became possible to sequence and compare entire 16S rRNAs.
In an effort to better understand the tree produced by cluster analysis, an alternative means of examining relationships known as “signature analysis” was developed (Woese et al., 1980). It was observed that certain of the ribonuclease T1 fragments were found only in a subset of the 16S rRNA catalogs. Frequently there was more than one such sequence that was uniquely found in the same group of organisms. Thus, the term “signature” was introduced as follows: “a set of oligonucleotides that is characteristic of (unique to) a group of organisms defines that group and is a ‘signature’ for the group”. These signatures suggested that there was a relationship between the organisms in the group, and so the tree was examined to see if the tree-generating algorithm had in fact found the expected relationship.
This process of checking the reasonableness of trees produced from the cataloging data was employed on several occasions (Woese et al., 1980; Woese et al., 1984; McGill et al., 1986). In its final rendition (McGill et al., 1986), the notion of a signature quality index that could be calculated for every individual RNAse T1 oligonucleotide was introduced as a means of formalizing the extent to which there was or was not a signature for each branch in the tree.
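The quoted definition of a signature can be illustrated with a short Python sketch. It assumes the strictest reading, namely that a signature oligonucleotide occurs in every member of the group and in no organism outside it. The function name signature_oligos and the example catalogs are hypothetical, and the sketch does not implement the signature quality index of McGill et al.

```python
def signature_oligos(catalogs, group):
    """Return oligonucleotides found in every group member and in no non-member.

    catalogs: dict mapping organism name -> set of oligonucleotide strings
    group:    set of organism names forming the candidate group
    """
    members = [oligos for name, oligos in catalogs.items() if name in group]
    others = [oligos for name, oligos in catalogs.items() if name not in group]
    if not members:
        return set()
    shared_by_all_members = set.intersection(*members)
    seen_outside_group = set().union(*others)
    return shared_by_all_members - seen_outside_group


if __name__ == "__main__":
    # Made-up catalogs for three hypothetical organisms.
    catalogs = {
        "orgA": {"AUCCG", "UUACG", "CCG"},
        "orgB": {"AUCCG", "UUACG", "AAG"},
        "orgC": {"CCG", "AAG", "UCACG"},
    }
    print(sorted(signature_oligos(catalogs, {"orgA", "orgB"})))
```

In practice a less strict criterion, for example tolerating a few exceptions inside or outside the group, may be preferable; that relaxation is what a quality index would quantify.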
Today, comparison of 16S rRNA sequences is widely used to establish the genetic relationship between bacteria. A typical approach is to amplify and sequence 16S rDNA from various prokaryotic organisms. The resulting sequences are aligned with other 16S rRNA sequences and an appropriate method, e.g. maximum likelihood, is used to construct a tree that reflects likely historical relationships. Several public databases exist containing complete and partial small subunit rRNA sequences. For example, release 8 of the RDP database (Maidak et al., 2000) includes data for the small subunit RNA from over 16,000 bacteria, eukaryotes, plastids and mitochondria.
As Woese's work became well known, it began to be appreciated that RNA might be useful in detecting the presence of a target organism in a test sample. Thus, in 1980 Kohne applied for patents (U.S. Pat. No. 4,851,330 granted Jul. 25, 1989 and U.S. Pat. No. 5,288,611 granted Feb. 22, 1994), the essence of which is that a nucleic acid probe that is complementary to the rRNA of a specific target can be used to detect the presence of that target. This core approach has been widely used in microbial identification, with probes usually being devised by sequence comparison rather than by subtractive hybridization, which was Kohne's preferred embodiment. Several commercial products rely on this approach.
The invention described here provides a novel approach for rapidly determining the genetic affinity of organisms in a test sample. The invention's methodology is far more general than the specifically targeted tests of the Kohne approach, and faster and more convenient than detailed sequencing of the rRNAs or their encoding DNA. The method of this invention is currently most readily utilized with 16S rRNA sequence data but can be adapted to other data sets such as rRNA spacers, RNAse P RNA, genomic DNA or RNA of viruses, etc. One begins by defining microbial groups within a phylogenetic tree that includes the organism range of interest, e.g. all bacteria. Then a set of characteristic oligonucleotides, each of which identifies a group in the phylogenetic tree, is determined according to a newly developed algorithm of the invention. This set of signature oligonucleotides is utilized in a hybridization experiment, e.g. a DNA microarray, the results of which are then used to quickly identify the phylogenetic neighborhood of a problematic bacterium or other microorganism. These hybridization experiments can be miniaturized so that minimally trained personnel can readily conduct them in difficult environments. The set of signature oligonucleotides can be updated and redesigned as our knowledge of the true genetic affinity between known organisms improves. In many cases, the hybridization array will be able to determine the genetic affinity of multiple organisms in a sample in one experiment. If the organism turns out to be a previously known organism, its identity can be determined to the species level if suitable signature oligonucleotides are included in the hybridization. Under some circumstances, the signature sequences can also be used in assays in which detection does not rely on hybridization.
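By way of illustration only, the interpretation of a hybridization result against a tree of groups might be sketched in Python as follows. Nothing in this sketch is taken from the algorithm of the invention: the probe and group names, the 0.5 calling threshold, the binary signal values, and the functions call_groups and called_paths are all hypothetical assumptions chosen to make the idea concrete.

```python
def call_groups(probe_groups, signals, threshold=0.5):
    """Return the groups whose signature probes hybridized.

    probe_groups: dict mapping probe id -> group the probe is a signature for
    signals:      dict mapping probe id -> True if hybridization was detected
    threshold:    hypothetical minimum fraction of a group's probes that must be positive
    """
    totals, hits = {}, {}
    for probe, group in probe_groups.items():
        totals[group] = totals.get(group, 0) + 1
        if signals.get(probe, False):
            hits[group] = hits.get(group, 0) + 1
    return {g for g in totals if hits.get(g, 0) / totals[g] >= threshold}


def called_paths(children, called, node):
    """List every path from node downward that passes only through called groups."""
    next_nodes = [c for c in children.get(node, []) if c in called]
    if not next_nodes:
        return [[node]]
    paths = []
    for child in next_nodes:
        for tail in called_paths(children, called, child):
            paths.append([node] + tail)
    return paths


if __name__ == "__main__":
    # Hypothetical tree: root -> G1, G2; G1 -> G1.1, G1.2
    children = {"root": ["G1", "G2"], "G1": ["G1.1", "G1.2"]}
    probe_groups = {"p1": "G1", "p2": "G1", "p3": "G2", "p4": "G1.1", "p5": "G1.2"}
    signals = {"p1": True, "p2": True, "p3": False, "p4": True, "p5": False}
    called = call_groups(probe_groups, signals)
    print(called_paths(children, called, "root"))
```

Each returned path runs from the root to the most specific group supported by the signal, so a sample containing more than one organism would yield more than one path.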