Each of the applications and patents cited in this text, as well as each document or reference cited in each of these applications and patents (including during the prosecution of each issued patent; “application cited documents”), and each of the PCT and foreign applications or patents corresponding to and/or claiming priority from any of these applications and patents, and each of the documents cited or referenced in each of the application cited documents, are hereby expressly incorporated herein by reference in their entirety. More generally, documents or references are cited in this text; and, each of these documents or references (“herein-cited documents or references”), as well as each document or reference cited in each of the herein-cited documents or references (including any manufacturer's specifications, instructions, etc.), is hereby expressly incorporated herein by reference. Further, various references are cited by their WWW addresses and the content of these references is also expressly incorporated herein by reference.
The present invention discloses methods and materials that efficiently normalize cDNA libraries. The present invention also discloses methods and materials for aiding subtractive/differential hybridization and other normalization procedures. The methods and materials can be packaged in the form of a kit. The present invention supports a wide variety of genetic applications, including the isolation, identification and analysis of genes, the analysis and diagnosis of disease states, the study of cellular differentiation, and gene therapy.
Approximately 20,000 genes are expressed in a typical mammalian tissue. However, not all genes are expressed in equal copy numbers. A wide range of gene expression patterns and/or levels is found among different cell types and at different stages of development.
Genes that are transcribed in many copies are categorized as “highly expressed genes” or “high copy number genes.” High copy number genes are often associated with maintenance of basic cellular functions, and are therefore known as “house-keeping” genes. Transcription of house-keeping genes is usually constitutive.
Genes that are transcribed in fewer copies are categorized as “moderate or rarely expressed genes” or “medium or low copy number genes.” Transcription of medium or low copy number genes is often subject to regulation, giving rise to differential patterns of expression. A regulated or restricted pattern of expression can be indicative of a unique gene function. For example, expression of VEGF, which plays a critical role in formation of the vascular endothelium, is restricted to the vascular endothelium. Mice that lack this gene have an embryonic lethal phenotype. De-regulation of differential gene expression is associated with many diseases, most notably, cancer.
Cloning of low copy number genes is difficult. Cloning involves screening of genetic libraries, such as cDNA or genomic libraries, using a polynucleotide complementary to the target gene. Prohibitive levels of background in many cDNA libraries lead to repetitive screening and/or sequencing of large numbers of clones. When cDNA libraries are used, hybridization and related screening procedures would be improved by reducing the proportion of high copy number genes, or “background,” in the library. Presently, there are significant problems with the techniques available for improving the efficiency of cDNA library screening.
For example, a rat liver cDNA library is dominated by fewer than fifty highly/moderately expressed genes, which in turn constitute nearly 50% of the cloned genes (http://www.ncbi.nlm.nih.gov/UniGene/lib.cgi?ORG=Rn&LID=31) and create a large “background” against which low copy number genes must be selected. Consequently, rarely expressed genes, which are often the focus of research efforts to isolate disease genes, are “buried” in the background.
“Normalization” procedures reduce the redundancy of highly expressed genes, or background, in cDNA libraries, thereby increasing the relative amount of transcripts represented by rarely expressed genes. Previous normalization procedures rely on annealing of opposite strands of nucleic acids. That is, the higher the concentration of a nucleic acid fragment, the higher the probability that it will anneal to its complementary fragment. Thus, annealing occurs more rapidly for a high copy number transcript than for a low copy number transcript.
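The concentration dependence described above can be illustrated with ideal second-order reassociation kinetics, in which the fraction of a species still single-stranded after time t is 1/(1 + k·C0·t). The following is a minimal sketch under that textbook assumption; the rate constant, concentrations, and incubation time are arbitrary illustrative values and are not taken from this disclosure.

```python
def fraction_single_stranded(c0, k, t):
    """Fraction of a nucleic acid species still single-stranded.

    Assumes ideal second-order reassociation kinetics:
    c0 = initial concentration of the species,
    k  = reassociation rate constant,
    t  = incubation time (units only need to be mutually consistent).
    """
    return 1.0 / (1.0 + k * c0 * t)


# Hypothetical, unitless values chosen only for illustration.
K = 1.0
T = 10.0

# An abundant species (high c0) reanneals much further than a rare one.
abundant_ss = fraction_single_stranded(c0=1.0, k=K, t=T)
rare_ss = fraction_single_stranded(c0=0.001, k=K, t=T)

print(f"abundant species still single-stranded: {abundant_ss:.3f}")
print(f"rare species still single-stranded:     {rare_ss:.3f}")
```

Under these assumed values most of the abundant species has become double-stranded while the rare species remains largely single-stranded, which is the property that separation-based normalization methods exploit.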
Soares et al. (1994) Proc. Natl. Acad. Sci. USA 91:9228-32 concerns such a procedure, where the unannealed, more rarely expressed single-stranded nucleic acid population is separated from the more highly expressed double-stranded population. The separation method involves hydroxyapatite column chromatography, wherein the double-stranded DNA selectively binds to hydroxyapatite. The single-stranded DNA is recovered from the flow-through fraction, processed and cloned in bacteria. Despite claims of high “normalization efficiency”, this method is cumbersome to use, requires high amounts of input DNA, involves several reaction steps, and results in a loss of material from failure to fully elute single-stranded DNA from the column. The end result is an incomplete reduction of genetic redundancy (or poor normalization). http://www.ncbi.nlm.nih.gov/dbEST/index.html provides public cDNA libraries.
Another normalization method concerns digestion by restriction endonucleases, wherein the preferential target is double-stranded DNA. This method is disclosed in “Normalizing cDNA libraries using the Eppendorf Thermomixer” by Scheinert and Schalk; Bernhard-Nocht-Inst. of Tropical Medicine, Virology Dept., Hamburg, Germany; Eppendorf products, application catalogue and http://www.eppendorf.com/prepa/page8.html.
Another method is enzymatic degrading subtraction (EDS) for construction of subtractive libraries from PCR-amplified cDNA. Zeng et al. (1994) Nucl. Acids Res. 22:4381-4385. The tester DNA is blocked by thionucleotide incorporation, the rate of hybridization is accelerated by phenol-emulsion reassociation, and the driver cDNA and double-stranded hybrid molecules are enzymatically removed by digestion with exonucleases III and VII rather than by physical partitioning. Here, double-stranded DNA represents the more highly expressed genes, which have a higher probability of annealing to their complementary fragments. EDS has been used to construct a subtractive library enriched for cDNAs expressed in adult but not embryonic rat brains.
Yet another normalization method involves hybridization to genomic DNA coated onto beads. In genomic DNA, all genes are essentially present in the same copy number, and thus highly expressed genes will hybridize to genomic DNA in the same copy number as rarely expressed genes. Coche (1997) Met. Mol. Biol. 67:359-369. However, this method suffers from many of the same shortcomings as hydroxyapatite-column-chromatography separation methods discussed above. Consequently, this method is not widely used for normalization, but for selecting cDNA encoded by a chromosome or genomic DNA fragment of interest.
Suppression Subtractive Hybridization (SSH) and other subtraction methods preserve the copy number difference in the subtracted population, which leads to redundancy. A modified SSH method was developed by Diachenko et al. (1999) Met. Enzymol. 303:349-380. The modified SSH attempts normalization and subtraction in one reaction.
In the end, hybridization-based methods depend on the efficiency of hybridization and the sensitivity of the separation/selection techniques. They are therefore affected by variability in the sources of RNA and lack reproducibility in practice. Moreover, current normalization methods are not recommended for full-length library preparation because they require either high temperature treatment to denature and renature the DNAs, which breaks longer DNA strands, or polymerase chain reaction (PCR), which preferentially amplifies shorter fragments. Genetic research requires better solutions to the problem of redundancy in cDNA libraries.
The present invention discloses novel methods and materials for normalizing nucleic-acid material. These methods and materials are based on a technique called “prime and kill.” Unlike previous methods, prime and kill based methods or procedures act by preventing cDNA synthesis either by physical blockage or by cleavage of the RNA. The invention comprises two modes, the “Killer Primer mode” and the “conventional mode.”
The killing reaction in Killer Primer mode involves the formation of RNA-DNA duplexes. The RNA is mRNA encoding the target and non-target genes; the DNA oligonucleotides are of preselected sequences, complementary to the 3′ end of target RNA transcripts of highly and/or moderately expressed genes. Once the duplexes are formed, the hybridized RNA is specifically cleaved by RNAse H. The result is that the target RNA is cleaved into a 3′-end poly-A mRNA strand, and usually just one other, longer fragment. Double-stranded cDNA is then synthesized from an oligo(dT) primer that is complementary to the poly-A mRNA sequence and contains a first upstream restriction endonuclease site, preferably an AscI site, which is useful for subsequently specifically cloning non-target genes. Adapters with a second restriction endonuclease site are then ligated to both ends of all the cDNA molecules. The cDNAs are then digested with both first and second restriction endonucleases and ligated into a suitable vector having corresponding restriction endonuclease sites.
The Killer Primer mode has the following steps. Highly and/or moderately expressed genes (“target genes”) are selected by known expression patterns (such as expressed sequence tag (“EST”) databases) in the tissue of interest, or by constructing a mini cDNA library and sequencing a random selection of, for instance, about 500 clones, or both. “Killer Primers” are then designed and synthesized. These Killer Primers are oligonucleotides complementary to the 3′ ends of the mRNA encoding the target genes (“target mRNA”). The mRNA is exposed to an excess of Killer Primers under conditions sufficient to form heteroduplexes specifically with the Killer Primers and the target mRNA. A killing reaction is then performed where RNAse H creates sequence-specific nicks in the heteroduplexed target mRNA.
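The primer design step above can be sketched in a few lines. This is only an illustration of taking the DNA reverse complement of the 3′ end of a target mRNA; the function name, the default primer length, the choice to exclude the poly-A tail from the primer, and the example sequence are all hypothetical assumptions and are not specified in this disclosure.

```python
# Map each RNA base to its complementary DNA base.
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def design_killer_primer(mrna, length=20):
    """Return a DNA oligonucleotide (written 5'->3') complementary to the
    3' end of `mrna` (an RNA string written 5'->3').

    Assumption for this sketch: the trailing poly-A tail is trimmed first,
    so the primer targets gene-specific sequence rather than the tail.
    """
    body = mrna.upper().rstrip("A")   # drop trailing A residues (poly-A tail)
    target_region = body[-length:]    # 3'-most gene-specific bases
    complement = target_region.translate(RNA_TO_DNA_COMPLEMENT)
    return complement[::-1]           # reverse so the oligo reads 5'->3'


# Hypothetical target mRNA: a short body followed by a 12-base poly-A tail.
example_mrna = "GGCUAGCUUACGAUCG" + "A" * 12
primer = design_killer_primer(example_mrna, length=8)
print(primer)
```

In practice a candidate primer would also need to be checked for specificity against non-target transcripts, since the text notes that “cross-killing” of non-target mRNA is a concern.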
First and second strand cDNA synthesis is then performed by reverse transcription of the mRNA. Non-target mRNA is specifically primed with oligo(dT) primers containing a first restriction endonuclease recognition site. Optionally, tests are performed to test the efficiency of killing and “cross-killing” (i.e. killer-primer priming of cDNA synthesis on non-target mRNA) by PCR. The cDNA is then ligated to adapters containing a second restriction endonuclease site (EcoRI or NotI are preferred restriction endonucleases). The ligation products are size-fractionated to remove fragments corresponding to 3′ ends of target cDNA. cDNAs are digested with the first and second restriction endonucleases and specifically cloned into a vector appropriately digested with the first and second restriction endonucleases.
The Killer Primer mode of the method allows specific cloning of the non-target cDNA exclusively, while the target genes will not get inserted into the plasmid because they lack the first restriction endonuclease site and therefore are flanked by only the second restriction endonuclease site. Thus only the non-target cDNA is cloned.
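The selection logic described above can be expressed as a small sketch. It models each cDNA only by whether it was primed with the oligo(dT) primer carrying the first site; the class, field, and function names are hypothetical, and AscI and EcoRI are used as the first and second sites because the text names them as preferred examples.

```python
from dataclasses import dataclass


@dataclass
class CDNA:
    name: str
    primed_by_oligo_dt: bool  # True if primed by the oligo(dT) primer with the first site


def ends_after_digestion(cdna, first_site="AscI", second_site="EcoRI"):
    """Return the pair of sticky ends left after digestion with both enzymes.

    Adapters put the second site on both ends of every cDNA. Only cDNA
    primed with the oligo(dT) primer also carries the first site, which
    lies inside the adapter at the 3' end and so defines that end.
    """
    three_prime_end = first_site if cdna.primed_by_oligo_dt else second_site
    return (second_site, three_prime_end)


def is_clonable(cdna, first_site="AscI", second_site="EcoRI"):
    """A vector cut with both enzymes accepts only inserts with one end of each kind."""
    return set(ends_after_digestion(cdna, first_site, second_site)) == {first_site, second_site}


pool = [
    CDNA("non_target", primed_by_oligo_dt=True),
    CDNA("target", primed_by_oligo_dt=False),
]
cloned = [c.name for c in pool if is_clonable(c)]
print(cloned)
```

The sketch ignores internal restriction sites within the cDNA and the size-fractionation step, both of which affect a real library.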
In another embodiment, the invention involves the following steps. Synthesizing short first strand cDNA extensions on RNA of interest using random primers or oligo(dT) primers or anchored oligo(dT) primers. Digesting the RNA molecules and purifying short first strand cDNA. Mixing the purified first strand cDNA with the same RNA source (for normalization) or mixing the purified first strand cDNA with a different RNA source of interest (for subtraction). Adding RNAse H to the mixture obtained and incubating the mixture at a suitable temperature such as 37° C. or 42° C. Denaturing the DNA-RNA duplex at a suitable temperature, preferably 70° C.; optionally repeating the process steps of incubation and denaturation without adding RNAse H in each cycle; and synthesizing cDNA using standard methods, followed by cloning. In one preferred embodiment, the RNAse H is an E. coli RNAse H. In another preferred embodiment, the RNAse H is a thermostable RNAse H such as Hybridase™.
This embodiment, when used for normalization, allows preferential elimination of highly expressed genes through cycling of the killing reaction. Thus, the lower the copy number of the mRNA, the more frequently it will be cloned. When used for subtraction, this embodiment allows enrichment of differentially expressed genes by degrading other mRNAs through killing reactions.
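The effect of cycling on abundance can be illustrated with a deliberately simple toy model. It is not the kinetics of the disclosed method: it assumes that the first strand cDNA “driver” for each gene is proportional to that gene's mRNA copy number, that the driver is not consumed, and that each cycle destroys a fraction of the remaining mRNA that grows with the driver amount. The function name, parameters, and all numeric values are hypothetical.

```python
import math


def simulate_killing(mrna_copies, driver_fraction=0.1, rate=0.01, cycles=5):
    """Toy model of repeated incubation/denaturation (killing) cycles.

    mrna_copies     : dict mapping gene name -> starting mRNA copy number
    driver_fraction : assumed ratio of first strand cDNA to mRNA per gene
    rate            : assumed per-cycle killing constant (arbitrary units)
    cycles          : number of killing cycles
    """
    # Driver cDNA is assumed proportional to abundance and reused each cycle.
    driver = {gene: n * driver_fraction for gene, n in mrna_copies.items()}
    remaining = dict(mrna_copies)
    for _ in range(cycles):
        for gene in remaining:
            # Fraction surviving one cycle falls as the driver amount rises.
            remaining[gene] *= math.exp(-rate * driver[gene])
    return remaining


start = {"abundant": 1000.0, "rare": 10.0}
end = simulate_killing(start)

ratio_before = start["abundant"] / start["rare"]
ratio_after = end["abundant"] / end["rare"]
print(f"abundant:rare ratio before = {ratio_before:.1f}, after = {ratio_after:.2f}")
```

Under these assumptions the abundant transcript is depleted far more than the rare one, so the ratio between them shrinks with each cycle, which is the qualitative behavior the paragraph above describes.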
It is noted that in this disclosure, terms such as “comprises”, “comprised”, “comprising” and the like can have the meaning attributed to them in U.S. Patent law; e.g., they can mean “includes”, “included”, “including” and the like. These and other embodiments are disclosed or encompassed by the following.