The present invention relates to an improved and efficient method for the simultaneous monitoring of individual mutants of a microbe in mixed populations. Such mutants are distinguished from each other by utilizing the features of the mutated genes themselves. More particularly, by the method of this invention even mutants having subtle quantitative phenotypes, or those without plate screens, can be readily monitored quantitatively. This would facilitate understanding the role of the large number of novel genes identified by the systematic sequencing of microbial genomes, many of which may have only subtle quantitative phenotypes. It would also readily allow the parallel screening of a large number of mutated genes for their role under conditions which by their very nature cannot have a plate screen; examples include virulence genes of pathogenic microbes, and genes conferring stress tolerance to yeast cells during fermentation in liquid broths. This method has pronounced application in discovering the functions of genes of a large number of microbes of medical, industrial and agricultural importance.
Isolation and study of mutants impaired in normal cellular phenomena is a standard way to dissect them out at the genetic and biochemical levels, and finally to understand them at the molecular level. The conventional approach to screen mutants is on solid media set in petri-plates. If a mutant has increased growth or survival (positive phenotype) compared to the normal wild-type cells, then it can be easily selected on a plate, even from amongst a lawn of wild-type cells. It can also be enriched from a mixed population in a liquid broth by repeated selection. However, if a mutant cell shows reduced growth or survival (negative phenotype) under the selection condition, then it can not be identified from among a mixed population of cells in liquid broths; yet, a large number of such potential mutant cells can be allowed to form isolated colonies on solid media under non-selective conditions, replica-plated on to selective media and then growth assessed. On the other hand, if there is no plate screen for the phenotype, i.e. if the phenotype does not show up on solid media, then screening by replica plating is not possible. One such example is mutants impaired in stress tolerance under fermentation conditions in liquid broths. Another example is from pathogenic microbes, where to isolate mutants impaired in virulence, one has to individually test potential mutants for pathogenicity in the host organism, which is extremely laborious. Thus, conventional mutant screening methods are very inadequate to identify genes whose mutant phenotype does not show up on solid media.
Four methods have been reported which partially redress this problem, though they remain quite laborious (Hensel et al., 1995, Science 269: 400-403; Shoemaker et al., 1996, Nature Genetics 14: 450-456; Smith et al., 1995, Proc. Natl. Acad. Sci. USA 92: 6479-6483; Cormack et al., 1999, Science 285: 578-582). In the first method, known as 'signature tagged transposon mutagenesis' (Hensel et al., 1995, Science 269: 400-403), random mutants are generated by insertional inactivation of genes by transposons. Prior to this step, the transposons are uniquely marked with random sequence tags. Thus, the mutated genes are tagged with transposons, which in turn are tagged with unique, but random sequence tags; the mutants can be individually monitored in mixed populations by means of the sequence tags. This method was developed and used for identifying bacterial virulence genes. While this is a major advance for screening mutants having negative phenotypes in mixed populations, it suffers from three disadvantages, namely, the need for prior introduction of sequence tags, poor sensitivity (only about 100 mutants can be pooled together and screened), and the inability to monitor the phenotypes quantitatively.
In the second method, known as 'molecular bar coding' (Shoemaker et al., 1996, Nature Genetics 14: 450-456), each mutated gene is marked with a unique and known sequence tag. This is carried out by replacing the coding sequence of a gene with a selectable marker and a sequence tag, by transformation with PCR (polymerase chain reaction) products having small regions of homology to the genes being deleted. Once a large collection of strains is created, each mutated for a single gene and also carrying a unique sequence tag, all of them can be individually monitored in mixed populations by means of their sequence tags. This method is very powerful and can facilitate the quantitative monitoring of the fate of thousands of mutants simultaneously under any selection condition. However, the initial construction of the set of mutants is extremely laborious and time consuming. It is also very expensive, since for each gene to be deleted, a set of long oligos has to be custom synthesized. A prerequisite of this method is that the nucleotide sequence of the genes being deleted should be known. Thus, if the aim is to delete all the genes of a microbe, then the entire sequence of its genome should be determined in the first place. Another important requirement is that the microbe should have a good homologous recombination system for efficiently replacing the native genes with the deleted versions having a minimal length of sequence homology. The last requirement may turn out to be insurmountable for a large number of microbes. At present only the yeast Saccharomyces cerevisiae has been taken up to be studied by this method; the construction of deletion strains is currently being carried out by a large collaboration involving eight American and European laboratories.
However, this method is unlikely to be used for studying many important microbes particularly due to the lack of an efficient homologous recombination system, and also due to the cost, time and labor involved.
In the third method a variation of molecular bar coding is used. Here, to begin with, 96 different isogenic parent strains are constructed by introducing a unique sequence tag into each (Cormack et al., 1999, Science 285: 578-582). These are then mutated by random insertion of a transforming DNA in the genome. Then pools of 96 mutants each are made, where each mutant in the pool has a unique sequence tag. These are then distinguished from each other in mixed populations by hybridization. The limitation of this method is the need for initial introduction of unique sequence tags, and the need for doing a large number of hybridizations. In our method there is no need for introduction of sequence tags, making it more economical, less laborious and faster than existing methods.
In the fourth method, known as 'genetic footprinting' (Smith et al., 1995, Proc. Natl. Acad. Sci. USA 92: 6479-6483), a random population of mutants is obtained by transposon mutagenesis. However, the mutants are not uniquely marked with any specific sequence tag. Instead, the fate of each mutant during selection is individually analyzed by PCR, with a gene-specific primer and a transposon specific primer. If a particular PCR product corresponding to a mutant gene is present in the starting population of cells, but absent or reduced in the population subjected to selection, then that would indicate that mutants in that gene do not survive the selection. Though with this method one can identify the genes conferring subtle quantitative phenotypes (Smith et al., 1996, Science 274: 2069-2074), it suffers from the need for gene-specific primers and from the need to do individual PCR reactions for each gene of interest. These two requirements make this method extremely laborious and expensive: to comprehensively identify all the genes providing some benefit to a microbe under a selection condition, it is necessary to do several thousand PCR reactions. Since so many reactions have to be done for each selection condition, this method is extremely labor intensive, time consuming and costly. Besides, prior sequence information is necessary for designing gene-specific primers, and thus the method is applicable only to microbes whose genomes are fully sequenced.
Thus, there is a strongly felt need for a method which is less laborious, less time consuming, less demanding in terms of the prior sequence information of the genome, and also less dependent on the homologous recombination system of the organism. Such a method should be capable of detecting quantitative differences in phenotype, and also allow the isolation of mutants which do not have a plate screen. The conventional mutant screening methods on solid media are deficient in their ability to detect subtle quantitative phenotypes. Besides, as the mutants are essentially kept as isolated colonies, there is not much competition among mutant and wild-type cells for subtle differences in fitness to show up. This can be alleviated if the mutants are screened in a mixed population of cells, and different mutants are individually monitored by some feature of their genotype. The complete sequencing of several microbial genomes has revealed thousands of novel genes of unknown function. Even in such well-studied organisms as yeast and E. coli, about one third of the genes are novel. Obviously, mutants in these genes did not show up in the conventional mutant hunts, possibly due to the very nature of the phenotypes conferred by these genes. It appears likely that many of these genes make only subtle or marginal contributions to the fitness of the microbe and thus were missed in conventional screens. Indeed, many of them, when appropriately tested, were found to make subtle, but nevertheless significant contributions to the fitness of the organism (Smith et al., 1996, Science 274: 2069-2074; Thatcher et al., 1998, Proc. Natl. Acad. Sci. USA 95: 253-257). Thus, an improved method should be capable of directly identifying mutants that may have only such quantitative phenotypes. Another reason for such a method is that the phenotype of some classes of mutants may show up only under some special conditions and not on solid media, e.g. in liquid broths for fermentation, and in host organisms for mutants impaired in virulence.
The main object of the present invention is to provide an improved and efficient method for the simultaneous monitoring of the abundance of individual mutants of a microbe in mixed populations.
Another object is to use this method to quantitatively trace the abundance of known mutants in mixed populations to follow their fitness under various environmental conditions.
Yet another object is to use this process to identify novel genes conferring quantitative or difficult-to-screen phenotypes, and thereby assign function to these genes.
Accordingly, the present invention provides an improved and efficient method for the simultaneous monitoring of the abundance of individual mutants of a microbe in mixed populations, which comprises,
i) generating a population of mutants by the random insertion of a known transposon in the genome of a microbe such that each mutant will preferentially carry only a single transposon insertion, by known methods,
ii) isolating the total genomic DNA of the mixed population of mutants by known methods,
iii) fragmenting the genomic DNA with a frequently cutting restriction enzyme, by known methods,
iv) ligating a double stranded adapter to the genomic DNA fragments, by known methods,
v) amplifying only the DNA fragments adjoining transposon insertions by PCR, specifically and quantitatively, using a transposon specific primer and an adapter specific primer, thereby generating a set of DNA fragments corresponding only to the mutated genes of the population of mutants,
vi) resolving the amplified DNA fragments according to their size, by known methods,
vii) comparing the intensity of the DNA fragments obtained from the population of mutants before selection with that obtained from the population subjected to selection, thereby monitoring the abundance of individual mutants by means of the intensity of the corresponding DNA fragments, to quantitatively follow the abundance of the mutants during selection, and,
viii) sequencing the DNA fragments that change in abundance, by known methods, to identify the mutated genes.
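The comparison in step vii) can be illustrated with a minimal sketch. It assumes that band intensities have already been measured for each fragment size in the lanes before and after selection; the function name, the dictionary layout, the pseudocount and the example numbers are hypothetical and are not part of the disclosed method. Each lane is normalized to its total signal, and the log2 change of each band's share is reported.

```python
import math

def band_log2_changes(before, after, pseudocount=1e-9):
    """Return log2(after/before) of each band's share of total lane signal.

    `before` and `after` map fragment size (bp) -> measured band intensity.
    Hypothetical helper, for illustration only.
    """
    total_before = sum(before.values())
    total_after = sum(after.values())
    changes = {}
    for size, intensity in before.items():
        share_before = intensity / total_before
        # A band missing after selection is treated as zero intensity.
        share_after = after.get(size, 0.0) / total_after
        changes[size] = math.log2(
            (share_after + pseudocount) / (share_before + pseudocount)
        )
    return changes

# Made-up intensities: the 240 bp band drops to a quarter after selection.
before = {120: 100.0, 185: 100.0, 240: 100.0, 310: 100.0}
after = {120: 100.0, 185: 100.0, 240: 25.0, 310: 100.0}

changes = band_log2_changes(before, after)
# Bands whose share fell by more than two-fold (an arbitrary threshold).
depleted = [size for size, change in changes.items() if change < -1.0]
print(depleted)
```

Because each lane is normalized to its own total, bands of unaffected mutants show a small apparent increase when another band is depleted; a reference band of known behaviour could be used for normalization instead.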
In an embodiment of the present invention insertional mutagenesis can be carried out preferably by transposons or by the random insertion of foreign DNA introduced by transformation or by other means.
In another embodiment of the present invention known mutants can be constructed individually, and then mixed and studied together to accurately and quantitatively monitor their phenotypes under various selection conditions, particularly those without any plate screen.
In yet another embodiment of the present invention mutants of selected genes of a microbe homologous to unknown human genes can be constructed, and then mixed and studied together to accurately and quantitatively monitor their phenotypes under various selection conditions, thereby to identify the biological function of these genes.
In yet another embodiment of the present invention the amplified DNA fragments can be labeled to high specific activity and used as hybridization probes to screen colony blots; by screening duplicate blots, one with DNA fragments corresponding to the starting population of mutants, and another with DNA fragments corresponding to the selected population of mutants, clones that change in abundance can be identified and characterized to identify the mutated genes.
In yet another embodiment of the present invention the amplified DNA fragments can be hybridized to gene-filters or DNA chips where DNA corresponding to all the genes of an organism are spotted on filters or glass slides at known locations. By comparing the intensity of the signals obtained with the DNA fragments of mutants before and after selection, one can directly identify the genes important under the selection conditions, particularly for which there is no plate screen.
In yet another embodiment of the present invention this method can be modified to monitor the genes carried on plasmids, and thereby quantitatively monitor the cells carrying such plasmids under varied selection conditions.
In yet another embodiment of the present invention this method can be used to monitor the abundance of different kinds of DNA molecules in a DNA preparation.
The details of the method of the present invention are as follows:
The first step of the process is the generation of a large collection of mutants by insertional inactivation of genes by known transposon tagging methods. In the yeast Saccharomyces cerevisiae this can be achieved either by Ty mutagenesis (Garfinkel and Strathern, 1991, Methods in Enzymology 194: 342-361) or by shuttle mutagenesis (Hoekstra et al., 1991, Methods in Enzymology 194: 329-342). Ty mutagenesis uses a modified yeast transposon which can be induced to transpose at a high frequency under the control of a galactose inducible promoter. One limitation of this method is that the Ty elements tend to insert preferentially near tRNA genes. An ideal insertion mutagenesis system is one which randomly inactivates the genes of a microbe without any bias for target sequences. Shuttle mutagenesis, which uses a Tn3 based minitransposon, has minimum bias for targets. In this method, yeast genomic DNA, being propagated in E. coli as part of a genomic library, is transposon mutagenized and then introduced into yeast by transformation. The transformation is done with a low amount of DNA to ensure that each transformant receives only a single mutation. Since different transformants will receive different mutated regions of the yeast genome, a large collection of such transformants will represent mutants having mutations in almost all the non-essential genes of yeast. Shuttle mutagenesis is applicable to any microbe that can be transformed, and in which gene disruption can be carried out by homologous recombination. It may also be possible to introduce random insertion mutations into the genome of a microbe by transforming with a foreign DNA lacking any homology to its genome. The transposon (or the foreign DNA), besides mutating the target gene, also serves as a sequence tag. However, since all the mutations have the same tag, in the process of the present invention they are distinguished from each other by the sequence features of the mutated genes themselves. To achieve this, the DNA flanking the transposon insertions is selectively and quantitatively amplified as follows.
Genomic DNA from the mixed population of yeast mutants can be isolated by using standard methods (Kaiser et al., 1994, Methods in Yeast Genetics, Cold Spring Harbor Laboratory Press). The average size of the purified genomic DNA should be larger than 20 kb, and it should be free of any deoxyribonuclease contamination. When the DNA is isolated from a mixed population of mutants, each mutant should be represented by tens of thousands of cells so as to reduce sampling error.
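The reason for requiring tens of thousands of cells per mutant can be seen from a rough estimate. The sketch below assumes that the number of cells of a given mutant captured in a sample follows Poisson statistics; this assumption and the function name are introduced here for illustration only. Under that assumption the relative sampling error (coefficient of variation) is 1/sqrt(n) for a mean of n cells.

```python
import math

def relative_sampling_error(cells_per_mutant):
    """Coefficient of variation of a Poisson count with the given mean."""
    return 1.0 / math.sqrt(cells_per_mutant)

for n in (100, 10_000, 40_000):
    print(n, relative_sampling_error(n))
```

Under this simple model, 100 cells per mutant give a relative error of about 10%, whereas 10,000 cells bring it down to about 1%, which is small compared with the changes in band intensity one would hope to detect.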
The genomic DNA is then cut with a frequently cutting restriction enzyme; the enzyme chosen should be such that most of the DNA fragments obtained are below 300 bp in size. These are then ligated with a double stranded adapter having one end compatible with the ends generated by the restriction enzyme. This adapter is composed of two oligonucleotides which are not phosphorylated. Besides, their sequence should not be similar to any region of the genome of the microbe being studied. This can be ensured if the entire sequence of the microbe's genome is known; even if the sequence is not known, if a sufficiently long adapter is chosen then it is very likely to be a unique sequence and different from the sequence of the microbe's genome. In the process of the present invention, the adapters used are similar to those used for selective amplification of random restriction fragments (Vos et al., 1995, Nucleic Acids Research 23: 4407-4414); however, in the process of this invention the subsequent steps are modified such that instead of random restriction fragments, only those restriction fragments adjoining transposon insertions are amplified, as described below.
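Where the genome sequence is available, the choice of enzyme can be checked in silico. The following is a minimal sketch, assuming a linear sequence held as a string and a single recognition site; the function names are hypothetical, GATC is used purely as an example of a 4-base site, and the toy sequence is far too short to be representative.

```python
def digest_fragment_lengths(genome, site, cut_offset=0):
    """Lengths of the fragments obtained by cutting a linear sequence
    at every occurrence of `site`.

    `cut_offset` is the position of the cut within the recognition site
    (0 means the cut falls just before the first base of the site).
    """
    cut_positions = []
    start = genome.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = genome.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(genome)]
    # Skip zero-length pieces that arise when a cut falls at either end.
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]

def fraction_below(lengths, limit=300):
    """Fraction of fragments shorter than `limit` base pairs."""
    return sum(1 for n in lengths if n < limit) / len(lengths)

toy_genome = "AAGATCTTTTGATCCC"  # made-up sequence with two GATC sites
lengths = digest_fragment_lengths(toy_genome, "GATC")
print(lengths, fraction_below(lengths, limit=7))
```

With a real genome sequence, the same two functions could be run for several candidate enzymes, keeping the one for which `fraction_below(lengths, 300)` is highest.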
Selective amplification of DNA fragments adjoining transposon insertions is achieved by the use of a primer specific for the terminus of the transposon and a primer specific for the adapter sequence. Besides, the PCR conditions are optimized such that the restriction fragments that carry adapter sequences at both the termini are not amplified. As neither of the oligos of the adapter is phosphorylated, only one of the adapter oligos actually gets ligated to the ends of restriction fragments (through the 5'-phosphate of the restriction fragments). There is a denaturation step prior to the first cycle of PCR which ensures that the unligated adapter oligo falls away; besides, the concentration of template DNA molecules in the reaction is such that they do not renature during the one minute provided for the annealing of primers during PCR. As the adapter primer used in PCR is of the same sense as the adapter oligo ligated to the ends of restriction fragments, it cannot anneal and initiate DNA synthesis unless the complementary strand is first made, which will occur only if there is a binding site for the transposon specific primer. This ensures that only those DNA fragments adjoining transposon insertions get amplified. To further increase the specificity, the annealing temperature is kept as high as possible to prevent mispriming. Besides, nested or semi-nested PCR is done using primers that will anneal and amplify only those DNA fragments that were specifically synthesized during primary PCR, and not the spurious amplification products. These steps together ensure that only those DNA fragments abutting transposon insertions, which actually correspond to the mutated genes, get specifically amplified. The PCR is also designed such that it is quantitative, i.e., the concentration of the amplified DNA fragments is proportional to the concentration of the initial template molecules (which in turn is proportional to the abundance of the mutants). This is ensured by having a low concentration of primers in the reaction, which results in the amplification stopping due to the exhaustion of primers, and not due to the exhaustion of any other component of the PCR reaction. Besides, when multiple products are made, the concentration of any particular product does not become so high that product-product reannealing prevents its quantitative amplification. It may also be possible to modify the procedures published for PCR amplifying DNA of an unknown sequence adjoining a region of known sequence, such as suppression PCR (Siebert et al., 1995, Nucleic Acids Res. 23: 1087-1088) or ligation mediated PCR methods (Mueller and Wold, 1989, Science 246: 780-786; Pfeifer et al., 1989, Science 246: 810-813; Palittapongarnpim et al., 1993, Nucleic Acids Res. 21: 761-762; Prod'hom et al., 1998, FEMS Microbiol Lett 158: 75-81), for this purpose. However, these methods were used for amplifying only a single or a limited number of DNA fragments in a single PCR reaction, without much emphasis on quantitative amplification. Thus, much optimization may be necessary before any of them can be used for quantitative amplification of mutated genes from a mixed population of a large number of mutants, as has been done in the process of the present invention.
Once DNA fragments corresponding to the mutated genes are amplified they can be distinguished from each other by their size after resolving them in a high-resolution sequencing gel. This is possible due to the variable position of restriction sites with respect to transposon insertions in different genes. Even if there is some overlap between fragments, they can be resolved if a different restriction enzyme is used for the initial digestion of genomic DNA. By comparing the intensity of DNA fragments obtained before and after selection, the fate of the mutants during selection can be quantitatively monitored. That is, if a DNA band corresponding to a mutant has changed in intensity, then it would indicate that the abundance of the corresponding mutant has changed during selection. If one is dealing with a population of known mutants, then the identity of the band and that of the mutated gene will be known beforehand, and thus the role of the gene under the selection conditions can be quantitatively determined. If one is dealing with a population of unknown mutants, then the identity of the mutated gene can be found out by sequencing the DNA after isolating it from the relevant band and after reamplifying the same. The DNA fragments corresponding to the mutated genes can also be distinguished from each other by hybridizing to colony blots of a genomic library of the microbe under study. By hybridizing duplicate blots, one with the probe made from DNA fragments of the starting population of mutants, and the other with that of the selected population, one can identify the clones whose genes play some role during selection. The identity of the relevant gene can then be found out by sequencing the clone. Similar hybridizations can be carried out with gene filters or DNA chips where DNA corresponding to almost all the genes of a microbe such as yeast Saccharomyces cerevisiae are arrayed at known locations. 
Thus, from the position of the signal itself one can know the identity of the gene. It may also be possible to further simplify the screening to a single hybridization, by first subtracting the DNA fragments of selected population from that of starting population and using the remaining DNA as probe. Subtraction of DNA fragments can be done by representational difference analysis (Lisitsyn et al., 1993, Science 259: 946-951) or by suppression subtractive hybridization (Diatchenko et al., 1996, Proc. Natl. Acad. Sci. USA 93: 6025-6030).
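The size-based discrimination described above can be sketched as follows. The sketch assumes a linear genome held as a string, a single recognition site, and a fixed number of transposon-terminus and adapter bases contributing to each product; the function name and all parameter values are hypothetical. It also ignores the exact cut position within the site. The amplified fragment runs from the transposon terminus to the nearest restriction site beyond the insertion, so insertions at different positions generally give products of different lengths.

```python
def amplified_fragment_length(genome, insertion_pos, site,
                              transposon_end_len=0, adapter_len=0):
    """Predicted product length for a transposon inserted at `insertion_pos`.

    The genomic part of the product extends from the insertion point to the
    start of the next occurrence of `site`. Returns None when no site lies
    beyond the insertion point.
    """
    next_site = genome.find(site, insertion_pos)
    if next_site == -1:
        return None
    flanking = next_site - insertion_pos
    return transposon_end_len + flanking + adapter_len

toy_genome = "AAGATCTTTTGATCCC"  # made-up sequence with GATC at 2 and 10
for pos in (4, 7):
    print(pos, amplified_fragment_length(toy_genome, pos, "GATC",
                                         transposon_end_len=30,
                                         adapter_len=20))
```

Two insertions can still yield products of the same length by chance; as noted above, repeating the analysis with a different restriction enzyme changes the distance to the nearest site and so separates such overlapping bands.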
The following examples are given by way of illustration of the present invention and should not be construed to limit the scope of the present invention.