Research toward improving the ability to detect and identify microbial genomes has risen to prominence in part because of its application to defense against bio-terrorism and biological warfare. The steadily rising number of sequenced microbial genomes is also giving impetus to studies of natural populations in soil and water, with a view to understanding community composition and dynamics. An understanding of microbial community dynamics is also important in infectious disease health care, particularly in view of the rising prevalence of antibiotic-resistant strains of microorganisms. In each of these scenarios, genomic information needs to be sufficiently detailed to distinguish among strains, and needs to provide a quantitative measure of the relative abundance of individual genomes in a sample.
In the last twenty years, a variety of DNA-based techniques have been developed to allow comparisons of whole genomes. Perhaps one of the simplest approaches uses two-dimensional electrophoresis to separate restriction fragments. Fischer et al. (Cell 16:191-200 (1979)) combined size separation in the first dimension with mobility in a denaturing gradient in the second dimension, to effectively separate and then probe whole-genome restriction digests.
A PCR-based method has been developed that generates fingerprint profiles of bacterial DNA by amplifying fragments produced by cutting at rare restriction sites (Masny et al. (2001) Biotechniques 31:930-936), but its utility is limited to the analysis of relatively small fragments.
Restriction landmark genome scanning (RLGS) is a related method in which genomic DNA is end-labeled at sites generated by cleavage with a rare-cutting restriction enzyme, followed by gel electrophoretic size separation. The fragments are cleaved in situ with a second, more frequently cutting restriction enzyme and subjected to second-dimension electrophoresis to resolve the end-labeled fragments.
Recently, Rouillard et al. (Genome Res 11:1453-1459 (2001)) developed a software tool, designated virtual genome scan (VGS), that makes it possible to automatically predict the sequence of first-dimension NotI plus EcoRV fragments and second-dimension HinfI or DpnII fragments in RLGS patterns of total human DNA, by matching fragment mobilities to those predicted from the draft human genome sequence. The utility of this method was demonstrated by its ability to identify a specific NotI-EcoRV fragment from human chromosome 1 that is frequently absent from restriction digests of neuroblastoma cells. Sequence prediction by VGS, as well as cloning of the fragment, showed that it contained a CpG island that is part of the human orthologue of the hamster homeobox gene Alx3 (Wimmer et al. (2002) Genes Chromosomes Cancer 33:285-94).
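The idea of predicting an RLGS pattern from known sequence can be illustrated with a minimal sketch. It is not the VGS software of Rouillard et al.; the function names (`cut_positions`, `virtual_rlgs`) are hypothetical, the input is assumed to be a single linear sequence, and fragment lengths in base pairs stand in for gel mobilities. Only the top-strand cut positions of NotI (GC^GGCCGC), EcoRV (GAT^ATC) and HinfI (G^ANTC) are used, so overhang lengths are ignored.

```python
import re

# Recognition site and top-strand cut offset within the site.
# N in the HinfI site matches any base.
ENZYMES = {
    "NotI": ("GCGGCCGC", 2),   # GC^GGCCGC
    "EcoRV": ("GATATC", 3),    # GAT^ATC
    "HinfI": ("GANTC", 1),     # G^ANTC
}


def cut_positions(seq, enzyme):
    """Return sorted top-strand cut coordinates of one enzyme in seq."""
    site, offset = ENZYMES[enzyme]
    # A lookahead lets re.finditer report overlapping sites as well.
    pattern = "(?=" + site.replace("N", "[ACGT]") + ")"
    return [m.start() + offset for m in re.finditer(pattern, seq.upper())]


def virtual_rlgs(seq):
    """Predict one (first_dim_bp, second_dim_bp) spot per labeled NotI end.

    First dimension: fragments of a NotI + EcoRV double digest.
    Second dimension: the piece of that fragment that keeps the NotI end
    after an additional HinfI digest.
    """
    seq = seq.upper()
    not_cuts = set(cut_positions(seq, "NotI"))
    bounds = sorted(not_cuts | set(cut_positions(seq, "EcoRV")) | {0, len(seq)})
    hinf_cuts = cut_positions(seq, "HinfI")

    spots = []
    for left, right in zip(bounds, bounds[1:]):
        inner = [h for h in hinf_cuts if left < h < right]
        if left in not_cuts:   # label on the left end of the fragment
            spots.append((right - left, (inner[0] if inner else right) - left))
        if right in not_cuts:  # label on the right end of the fragment
            spots.append((right - left, right - (inner[-1] if inner else left)))
    return spots


if __name__ == "__main__":
    # Toy sequence with one NotI, one HinfI and one EcoRV site (an assumption
    # for illustration, not real genomic data).
    toy = "A" * 10 + "GCGGCCGC" + "A" * 20 + "GAATC" + "A" * 15 + "GATATC" + "A" * 10
    for first, second in virtual_rlgs(toy):
        print(f"first dimension {first} bp, second dimension {second} bp")
```

Each predicted spot can be compared only with fragments derived from sequence that is already known, which is the limitation discussed next.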
While VGS can provide a limited global survey for the presence or absence of a particular DNA fragment, it cannot directly identify novel sequences. VGS can be viewed as a closed architecture technique since it is inherently retrospective, relying on pre-established sequence information.
The above-described methods are tools for fingerprinting the genomes of individual organisms. They depend upon the integrity of the starting DNA, the completeness of digestion by the restriction enzymes, and the reproducibility of the electrophoretic separation procedures. In addition, the methods are unlikely to be applicable to identifying and quantitating the multiplicity of organisms in a natural, e.g., environmental, sample.
An open architecture, comprehensive, DNA-based method for the identification and quantitation of organisms in a sample (i.e., the organismic complexity of a sample) would find many applications. One such application relates to the identification and quantitation of organisms adapted for bioterrorist activities, while another relates to the identification and quantitation of organisms comprising a biofilm or those contained in a biological or other natural specimen and the dynamic population changes occurring in such samples over time. Population differences and changes in spatial distributions of organisms can also be demonstrated with such a system. Such a system would be particularly advantageous for identifying and quantifying organisms that are difficult to cultivate.