Traditionally, the cloning of useful metabolic genes has been performed either through a direct genetic approach or by the "reverse genetics" approach. The latter involves purification of an enzyme of interest followed by the identification of its gene through the use of antibodies or amino acid sequence information obtained from the pure protein.
Although both strategies are routinely used, they are often limited by technical problems. The genetic approach can only be used for organisms that have a developed genetic system or whose genes can be expressed in heterologous hosts. The reverse genetics approach requires purification of the protein of interest, amino acid sequencing, further determination of the DNA sequence, and amplification of a DNA probe from degenerate primers. Both approaches are time-consuming and inefficient.
Recently, mRNA techniques that can be employed to access regulated genes directly in the absence of a genetic system and without the purification of their gene products have been disclosed. These approaches are based on the comparison of the mRNA population between two cultures or tissues, and further identification of the genes or a subset of genes whose mRNA is more abundant under conditions of induction. These techniques rely on various methods including: 1) hybridization of labeled mRNAs onto arrays of DNA on membranes (Chuang et al., J. Bacteriol. 175:5242-5252 (1993)), 2) DNA microarrays (Duggan et al., Nat. Genet. 21:10-14 (1999)), 3) large scale sample sequencing of EST libraries (Rafalski et al., Acta Biochimica Polonica 45:929-934 (1998)), and 4) the sampling of mRNA by the production of randomly amplified DNA fragments by reverse transcription followed by polymerase chain reaction (RT-PCR).
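The comparison underlying these techniques can be illustrated with a minimal sketch. It assumes that each gene has already been assigned a numeric signal (for example a hybridization or band intensity) in a control culture and in an induced culture. The function name, the gene labels, the two-fold threshold, and the pseudocount are hypothetical choices made for illustration only and are not taken from any of the cited references.

```python
def induced_genes(control, induced, min_fold=2.0, pseudocount=1.0):
    """Return {gene: fold_change} for genes more abundant under induction.

    control, induced: dicts mapping a gene label to a signal intensity.
    min_fold and pseudocount are illustrative assumptions, not values
    prescribed by the text.
    """
    result = {}
    for gene in sorted(set(control) | set(induced)):
        # The pseudocount avoids division by zero for undetected messages.
        c = control.get(gene, 0.0) + pseudocount
        i = induced.get(gene, 0.0) + pseudocount
        fold = i / c
        if fold >= min_fold:
            result[gene] = fold
    return result


# Hypothetical signal intensities for three genes in two cultures.
control = {"geneA": 100.0, "geneB": 50.0, "geneC": 10.0}
induced = {"geneA": 105.0, "geneB": 400.0, "geneC": 90.0}

hits = induced_genes(control, induced)
for gene, fold in hits.items():
    print(f"{gene}: {fold:.1f}-fold")
```

In this toy data set, geneA is essentially unchanged between the two cultures and is not reported, while geneB and geneC exceed the assumed threshold.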
Two variations of sampling of mRNA by the production of arbitrarily amplified DNA fragments by RT-PCR have been published. The first one, differential display (DD) per se (Liang et al., Science 257:967-971 (1992); Liang et al., Nucleic Acids Res. 21:3269-3275 (1993)), starts with the synthesis of cDNAs by reverse transcription of mRNA using a poly-dT primer that hybridizes to the poly-A tail of eukaryotic messages. Synthesis of the second DNA strand is then initiated at random sites under low stringency using an oligonucleotide of arbitrary sequence. Subsequent exponential amplification by PCR yields a series of DNA fragments in a process essentially identical to that of random amplification of polymorphic DNA (RAPD) (Williams et al., Nucleic Acids Res. 18:6531-6535 (1990)). This technique is commonly used for eukaryotic applications.
The second method uses an arbitrary oligonucleotide primer to initiate reverse transcription of the message at random sites. This technique is independent of poly(A) tails and can be used for both eukaryotic and prokaryotic cells (Welsh et al., Nucleic Acids Res. 20:4965-4970 (1992)). In spite of this teaching, only a handful of prokaryotic applications of DD have been published to date (Abu Kwaik et al., Mol. Microbiol. 21:543-556 (1996); Fleming et al., Appl. Environ. Microbiol. 64:3698-3706 (1998); Wong et al., Proc. Natl. Acad. Sci. USA 91:639-643 (1994); Yuk et al., Mol. Microbiol. 28:945-959 (1998); Zhang et al., Science 273:1234-1236 (1996)), suggesting difficulties with the method.
The above-cited methods are useful for the identification of selected inducible genes but suffer from several drawbacks when applied to the problem of identifying gene clusters and metabolic pathways, particularly in prokaryotic organisms. These drawbacks include: (i) the short half-life of prokaryotic mRNA makes any mRNA-based experiment more difficult than in eukaryotic systems, (ii) differential display often results in a high number of false positives, and (iii) current literature protocols are very cumbersome and time-consuming. No method is available which addresses these drawbacks and definitively distinguishes between false positives and those genes which are truly differentially expressed.
The problem to be solved, therefore, is to develop a reliable system for identifying inducible genes in prokaryotic systems. Applicants have solved the stated problem by providing a method for high density sampling of an mRNA population using a large number of arbitrary primers, where a single mRNA molecule is sampled repeatedly in independent RT-PCR reactions. The present invention represents a significant advance in the art, as the literature teaches only applications of differential display which use a small set of primers in a single RT-PCR reaction to generate many differentially amplified bands corresponding to differentially expressed genes, which are then analyzed on long high resolution sequencing gels (Liang et al., Science 257:967-971 (1992); Wong et al., Proc. Natl. Acad. Sci. USA 91:639-643 (1994); Fleming et al., Appl. Environ. Microbiol. 64:3698-3706 (1998)). Using the present method, Applicants were able to identify 21 induced gene fragments, all of which were functionally related. To date, the greatest number of primers used in a similar method is 32 (Rivera-Marrero et al., Microb Pathog 25 (6):307 (1998)), resulting in the identification of only 4 induced genes. Abu Kwaik et al. (Mol. Microbiol. 21:543-556 (1996)), using 30 primers, were able to identify only 1 induced gene.
The present method of multiple sampling of RNA is particularly suitable for prokaryotic applications, where RNA messages are polycistronic and thus constitute a larger target for arbitrary amplification by RT-PCR, which would permit the identification of more full-length genes.
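One way to picture the benefit of sampling a single message repeatedly in independent RT-PCR reactions is to count how many distinct arbitrary primers yielded a differentially amplified fragment from the same gene. The sketch below is illustrative only. The primer and gene labels are hypothetical, and the rule that support from at least two independent primers separates a likely true positive from a possible false positive is an assumption made for this example, not a criterion stated in the text.

```python
from collections import defaultdict


def genes_by_primer_support(fragments, min_primers=2):
    """Group differentially amplified fragments by gene.

    fragments: iterable of (primer_id, gene_id) pairs, one per band that
    appeared only in the induced sample and was mapped to a gene.
    min_primers: assumed number of independent primers required before a
    gene is treated as supported (illustrative, not from the source).
    Returns (supported, single_hits).
    """
    primers_per_gene = defaultdict(set)
    for primer_id, gene_id in fragments:
        # A set ensures the same primer is counted once per gene.
        primers_per_gene[gene_id].add(primer_id)

    supported = {
        gene: len(primers)
        for gene, primers in primers_per_gene.items()
        if len(primers) >= min_primers
    }
    single_hits = sorted(
        gene for gene, primers in primers_per_gene.items()
        if len(primers) < min_primers
    )
    return supported, single_hits


# Hypothetical results: each tuple is (arbitrary primer, gene hit).
fragments = [
    ("P01", "orf1"), ("P17", "orf1"),
    ("P42", "orf2"), ("P42", "orf2"),  # same primer twice: one sampling
    ("P08", "orf3"), ("P23", "orf3"), ("P51", "orf3"),
]

supported, single_hits = genes_by_primer_support(fragments)
print("supported by independent primers:", supported)
print("seen with a single primer only:", single_hits)
```

Under these assumptions a long polycistronic message, which offers more priming sites, would tend to accumulate support from more primers than a short monocistronic one.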