The present invention relates to methods for performing surveys of the genetic diversity of a population. The invention also relates to methods for performing genetic analyses of a population. The invention further relates to methods for the creation of databases comprising the survey information and the databases created by these methods. The invention also relates to methods for analyzing the information to correlate the presence of nucleic acid markers with desired parameters in a sample. These methods have application in the fields of geochemical exploration, agriculture, bioremediation, environmental analysis, clinical microbiology, forensic science and medicine.
Microbes have been used previously as biosensors to identify chemicals in the environment. For instance, microbes have been utilized as biosensors for the presence of nitrates (Larsen, L. H. et al., 1997, A microscale NO3− biosensor for environmental applications. Anal. Chem. 69:3527-3531), metals (Virta M. et al., 1998, Bioluminescence-based metal detectors. Methods Mol. Biol. 102:219-229), and a variety of hydrocarbons (Sticher P. et al., 1997, Development and characterization of a whole-cell bioluminescent sensor for bioavailable middle-chain alkanes in contaminated groundwater samples. Appl. Environ. Microbiol. 63(10):4053-4060). In these examples, however, the indicator microbes are not native species but rather the products of recombinant manipulations designed for specific applications. These modifications involve coupling the nutrient-sensing machinery of well-characterized bacterial strains with reporter genes to aid identification. This approach is limited, however, by the metabolic diversity of a few well-characterized bacterial strains. In contrast, the large and diverse pool of microbes in the environment represents a source of biosensors for a much larger range of applications than currently exists. Thus, there is a need to identify and use other microbes, especially those found in situ, as biosensors.
Microbes also have an important impact on health and medicine. It has been estimated that there may be ten times as many microbial cells associated with the human body as there are human cells. Many microbial cell populations that are associated with the human body play a beneficial role in maintaining health. For instance, gut microflora is important for proper digestion and absorption of nutrients and for production of certain factors, including some vitamins. In general, the human immune system is able to keep the bacterial populations of the human body in check and prevent the overgrowth of beneficial microbial populations and infection by detrimental microbial populations.
Nevertheless, the list of human diseases that are now attributed to microbial pathogens is growing. However, nearly all of the information regarding the relationships between microbes and human disease has been gained from approaches that require culture of microbial species.
Two examples of diseases where the causative agents were identified through molecular methods include bacillary angiomatosis (Relman, D. A. et al., 1990, New Engl. J. Med. 323: 1573) and Whipple's disease (Wilson, K. H. et al., 1991, Lancet 338: 474). Further, the central aspects of atherosclerosis are consistent with the inflammation that results from infection. DNA sequences from Chlamydia have been identified in atherosclerotic lesions, which has led to suggestions that this organism plays a role in the disease.
In addition, bacterial infections have become an increasing health problem because of the advent of antibiotic-resistant strains of bacteria. Further, microbial infections caused by bacteria or fungi that do not usually infect humans may be a problem in immunocompromised individuals. Further, individuals in developing countries who may be malnourished or lack adequate sanitary facilities may also support a large load of opportunistic bacteria, many of which may cause sickness and disease. In veterinary medicine, livestock living in close quarters also may be prey to infections caused by a variety of different types of microbes. Thus, there is a need to develop sensitive methods of identifying many different types of microbes without having to cultivate them first in order to treat or prevent microbial infections in humans and other animals.
Assaying for microbial contamination is an important component of food testing as well. A large number of different types of microbes may contaminate food for humans or animals. Thus, an ability to test food for contamination quickly and effectively is critical for maintaining food safety. However, many of the microbes responsible for causing sickness in humans and animals are difficult to isolate or identify.
Assays for microbial populations also have use in fields such as forensic science. Over the past ten to twenty years, scientists have determined that microbial populations change when bodies begin to decay, and have begun to identify certain microbial species that are indicative of decomposition (Lawrence Osborne, Crime-Scene Forensics; Dead Men Talking, New York Times, Dec. 3, 2000). However, only a few microbial species that may be useful in these analyses have been identified.
The problem of determining genetic diversity is not confined to microbial populations. Antibody diversity is critical for a proper immune response. During B cell differentiation, antibody diversity is generated in the heavy and light chains of the immunoglobulin by mechanisms including multiple germ line variable (V) genes, recombination of V gene segments with joining (J) gene segments (V-J recombination) and recombination of V gene segments with D gene segments and J gene segments (V-D-J recombination) as well as recombinational inaccuracies. Furthermore, somatic point mutations that occur during the lifetime of the individual also lead to antibody diversity. Thus, a huge number of different antibody genes coding for antibodies with exquisite specificity can be generated. T cell receptor (TCR) diversity is generated in a similar fashion through recombination between numerous V, D and J segments and recombinational inaccuracies. It has been estimated that 10^14 Vδ chains, more than 10^13 β chains and more than 10^12 forms of Vα chains can be made (Roitt, I. et al., Immunology, 3rd Ed., 1993, pages 5.1-5.14). A knowledge of the antibody or TCR diversity in a particular individual would be useful for diagnosis of disease, such as autoimmune disease, or for potential treatment.
The identification of microbes, especially soil microbes, has traditionally relied upon culture-dependent methods, whereby the detection of a microbial species depends upon the ability to find laboratory conditions that support its growth. To this end, 96-well plates have been commercially developed to identify microbes with different metabolic requirements. For instance, BioLog plates incorporate 96 different media formulations into the wells of a 96-well plate.
Despite these efforts, it is now accepted that far fewer than 1% of microbes can propagate under laboratory conditions (Amann, R. I. et al., 1995. Phylogenetic identification and in situ detection of individual microbial cells without cultivation. Microbiol. Rev. 59:143-169).
The widespread interest in genomics has created many exciting new technologies for the parallel quantitation of thousands of distinct nucleic acid sequences. While still in their infancy, these technologies have provided unprecedented insight into biology. To date, these technologies have predominantly been utilized in pharmaceutical and agricultural applications. Genome expression profiling has gained general acceptance in biology and is likely to become commonplace in all academic, biotechnology and pharmaceutical institutions in the 21st century. For instance, Serial Analysis of Gene Expression (SAGE) is a hybridization-independent method designed to quantitate changes in gene expression (Velculescu, V. E. et al., 1995, Serial analysis of gene expression. Science 270:484-487 and U.S. Pat. No. 5,866,330). However, SAGE only measures RNA levels from tissues or organisms, and is not suitable for examining genetic diversity.
The widespread interest in genomics has also led to the development of many technologies for the rapid analysis of tens of thousands of nucleic acid sequences. One such technology is the DNA chip. Although this approach has been used as a diagnostic for distinguishing between several species of the genus Mycobacterium (Troesch, A., et al., 1999, Mycobacterium species identification and rifampin resistance testing with high-density DNA probe arrays. J. Clin. Microbiol. 37:49-55), it has limited utility for an environmental microbial survey for two reasons. First, the sequence of the target DNAs to be analyzed must be known in order to synthesize the complementary probes on the chip. However, the vast majority of environmental microbes have not been characterized. Second, DNA chips rely on hybridization of nucleic acids, which is subject to cross-hybridization from DNA molecules with similar sequence. Thus, the resolving power of a hybridization-based approach is limited because one must identify regions of DNA that do not cross-hybridize, which may be difficult for related microbial species.
Genomic technologies and bioinformatics hold much untapped potential for application in other areas of biology, especially in the field of microbiology. However, to date there has not been a method to rapidly and easily determine the genomic diversity of a population, such as a microbial or viral population. Further, there has not been a method to easily determine the antibody or TCR diversity of a population of B or T cells, respectively. Thus, there remains a need to develop such methods in these areas.
The present invention solves this problem by providing methods for rapidly determining the diversity of a microbial or viral population and for determining the antibody or TCR diversity of a population of B or T cells. The present invention relies on hybridization-independent genomic technology to quickly "capture" a portion of a designated polymorphic region from a given DNA molecule present in a population of organisms or cells. This portion of the DNA molecule, a "marker," is characteristic of a particular genome in the population of interest. The marker can be easily manipulated by standard molecular biological techniques and sequenced. The sequence of a multitude of markers provides a measure of the diversity and/or identity of a population. In one aspect, the invention provides a method, Serial Analysis of Ribosomal DNA (SARD), that can be used to distinguish different members of a microbial population of interest.
In another aspect, the invention provides a method for analyzing a designated polymorphic region from a population of related viruses using method steps similar to those described for SARD. In a further aspect, the invention provides a method for analyzing the variable regions from the immunoglobulins or TCR genes of a population of immune cells using method steps similar to those described for SARD.
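By way of illustration only, the manner in which a multitude of sequenced markers may provide a measure of diversity can be sketched as follows. The function names, the example marker sequences and the use of the Shannon index are assumptions made for this sketch; the description above does not prescribe any particular diversity statistic.

```python
from collections import Counter
import math


def diversity_profile(marker_sequences):
    """Return the relative abundance of each distinct marker sequence.

    Hypothetical helper: each distinct marker is treated as one member
    of the population, and its frequency as that member's abundance.
    """
    counts = Counter(seq.upper() for seq in marker_sequences)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def shannon_index(profile):
    """Shannon diversity H = -sum(p * ln p), an assumed summary statistic."""
    return -sum(p * math.log(p) for p in profile.values() if p > 0)


# Hypothetical marker reads recovered from one sample.
markers = ["ACGTAC", "ACGTAC", "TTGACA", "GGCATC"]
profile = diversity_profile(markers)
diversity = shannon_index(profile)
```

In this sketch a sample dominated by a single marker yields an index near zero, while a sample with many equally abundant markers yields a larger value.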
In another aspect of the invention, a method is provided for analyzing a population based upon an array of the masses of peptides that are encoded by polymorphic sequences of particular DNA molecules in a region of interest. In a preferred embodiment, the region of interest is a designated polymorphic region from an rDNA gene from each member of a microbial population.
In another aspect of the invention, a method is provided for analyzing the information provided by the above-described methods. The method enables the creation of a diversity profile for a given population. A collection of diversity profiles provides an accurate representation of the members present in a population. These diversity profiles can be entered into a database along with other information about the population. The diversity profiles can be used with various correlation analyses to identify individuals, or sets of individuals, that correlate with each other. The correlation analyses can be used for diagnostic or other purposes.
In another aspect, the invention provides databases comprising various diversity profiles. In a preferred embodiment, the diversity profile is obtained by SARD.
In yet another aspect of the invention, a method is provided for identifying a diversity profile, as described above, that correlates with a parameter of interest. In a preferred embodiment, the diversity profile is a profile of the microbial populations that correlate with the presence of mineral deposits and/or petroleum reserves. In another preferred embodiment, the diversity profile is a profile of populations of different antibodies or TCR that correlate with a specific disease state, such as an autoimmune disorder.
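By way of illustration only, one possible correlation analysis between diversity profiles and a parameter of interest can be sketched as follows. The Pearson coefficient, the threshold of 0.8, the marker names and the sample values are all assumptions made for this sketch and are not taken from the description above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def correlated_markers(profiles, parameter, threshold=0.8):
    """Return markers whose abundance across samples tracks the parameter.

    profiles: one dict per sample mapping marker -> relative abundance.
    parameter: one measured value per sample (hypothetical, e.g. a
    hydrocarbon concentration). threshold is an assumed cutoff on |r|.
    """
    all_markers = sorted(set().union(*profiles))
    hits = {}
    for marker in all_markers:
        abundances = [p.get(marker, 0.0) for p in profiles]
        r = pearson(abundances, parameter)
        if abs(r) >= threshold:
            hits[marker] = r
    return hits


# Hypothetical profiles from three samples and a per-sample measurement.
profiles = [
    {"marker_A": 0.10, "marker_B": 0.70, "marker_C": 0.20},
    {"marker_A": 0.40, "marker_B": 0.55, "marker_C": 0.05},
    {"marker_A": 0.70, "marker_B": 0.10, "marker_C": 0.20},
]
parameter = [1.0, 4.0, 7.0]
hits = correlated_markers(profiles, parameter)
```

In this sketch both positively and negatively correlated markers are retained, since either may be informative about the parameter of interest.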
In a still further aspect, the invention provides a method for locating mineral deposits or petroleum reserves comprising identifying one or more nucleic acid markers that correlate with the presence of mineral deposits or petroleum reserves, isolating nucleic acid molecules from an environmental sample, determining whether the nucleic acid markers are present in the environmental sample, wherein if the nucleic acid markers are present, then the area from which the environmental sample was obtained is likely to have mineral deposits or petroleum reserves.
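By way of illustration only, the step of determining whether previously identified markers are present in an environmental sample can be sketched as a set comparison. The function name, the marker sequences and the reporting of a matched fraction are assumptions made for this sketch; the description above does not specify how many markers must be present.

```python
def markers_present(indicator_markers, sample_markers):
    """Return the indicator markers found in a sample and their fraction.

    indicator_markers: markers previously found to correlate with a
    deposit or reserve (hypothetical input).
    sample_markers: markers sequenced from the environmental sample.
    """
    indicators = {m.upper() for m in indicator_markers}
    observed = {m.upper() for m in sample_markers}
    found = indicators & observed
    fraction = len(found) / len(indicators) if indicators else 0.0
    return found, fraction


# Hypothetical indicator set and sample reads.
found, fraction = markers_present(
    ["ACGTAC", "TTGACA"],
    ["acgtac", "GGCATC"],
)
```

How large a matched fraction should be before an area is regarded as likely to contain a deposit is left open here and would depend on the correlation data available.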