Throughout this application, various publications are referenced by author and date within the text. Full citations for these publications may be found listed alphabetically at the end of the specification immediately preceding the claims. All patents, patent applications and publications cited herein, whether supra or infra, are hereby incorporated by reference in their entirety. The disclosures of these publications in their entireties are hereby incorporated by reference into this application in order to more fully describe the state of the art as known to those skilled therein as of the date of the invention described and claimed herein.
Sequencing of the ends of a large number of cDNA clones (ESTs, expressed sequence tags) is a parallel effort to genome sequencing. Genes predicted in genomic sequences are verified only if they are expressed in the form of RNA. These RNAs are represented as clones in cDNA libraries. Accelerating the discovery of new ESTs may greatly expedite the identification and cloning of human disease genes. Many ESTs are not full length; however, they provide important information for identification of expressed genes, verification of the exon-intron boundaries predicted from genome sequences, and detection of alternative splicing.
The copy number of mRNAs in a eukaryotic cell can differ by factors of thousands (Lewin 1994). The problem of detecting genes expressed at low levels is illustrated in the simple animal model C. elegans, which has a small genome of 10^8 base pairs predicted to encode only 19,000 genes (Thierry-Mieg et al. 1999). Forty thousand (40,000) cDNA clones from a large normalized cDNA library of C. elegans were sequenced to detect the first 7,400 expressed genes. The average rate of detecting genes was 5.4 clones per gene (The C. elegans Sequencing Consortium 1998). Twenty-five thousand (25,000) clones were sequenced to detect the next 1,700 expressed genes, an average rate of 14.7 clones per gene (Thierry-Mieg et al. 1999, Kohara et al. 1999).
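The diminishing return described above can be checked directly from the quoted clone and gene counts. The short Python sketch below only reproduces that arithmetic; the function and variable names are illustrative and do not come from the cited studies.

```python
# Clone and gene counts quoted in the text for C. elegans cDNA sequencing.
phases = [
    ("first 7,400 genes", 40_000, 7_400),
    ("next 1,700 genes", 25_000, 1_700),
]


def clones_per_gene(clones_sequenced, genes_found):
    """Average number of clones sequenced per newly detected gene."""
    return clones_sequenced / genes_found


rates = {label: clones_per_gene(clones, genes) for label, clones, genes in phases}

for label, rate in rates.items():
    print(f"{label}: {rate:.1f} clones per gene")
```

The second phase required nearly three times as many clones per new gene as the first, which is the cost that better normalization is meant to reduce.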
To detect genes expressed at lower levels than the genes already found, there is a need for better cDNA libraries. With the dramatically accelerated sequencing of the human genome, high priority should be placed on developing technology for finding rare ESTs. This raises the question of how to search for recombinant clones corresponding to rare mRNA species. Clearly, identification of such clones requires screening of large (complex) cDNA libraries. For the purpose of gene discovery, it would be attractive to construct cDNA libraries containing nearly equal amounts of cDNA from each expressed gene. Contents of different double-stranded DNAs have been equalized from mixtures of abundant and rare restriction fragments as a model system (Puzyrev et al. 1990). Since reassociation kinetics of denatured double-stranded DNAs obey the second-order equation $V_i = K_i\,[\mathrm{ssDNA}_i]^2$, the concentrations of different unhybridized DNA molecules can become nearly equal after partial reassociation. The unhybridized single-stranded DNAs can be separated from hybridized double-stranded molecules by hydroxyapatite chromatography and can be cloned after conversion into double-stranded DNA. This principle has been used previously for preparation of normalized cDNA libraries.
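The equalizing effect of second-order kinetics can be illustrated numerically. Integrating dC/dt = -K C^2 gives C(t) = C0 / (1 + K C0 t), which approaches 1/(K t) at long times regardless of the starting concentration C0. The Python sketch below assumes, as a simplification, that every species shares one rate constant; the concentrations, time points, and units are hypothetical and chosen only for illustration.

```python
def single_stranded_remaining(c0, k, t):
    """Integrated second-order law: dC/dt = -k*C**2  ->  C(t) = c0 / (1 + k*c0*t)."""
    return c0 / (1.0 + k * c0 * t)


K = 1.0  # assumed shared rate constant (arbitrary units)

# Hypothetical starting concentrations spanning a 1000-fold abundance range.
initial = {"abundant": 1000.0, "intermediate": 10.0, "rare": 1.0}


def abundance_ratio(t):
    """Ratio of abundant to rare single-stranded DNA after reassociation time t."""
    remaining = {
        name: single_stranded_remaining(c0, K, t) for name, c0 in initial.items()
    }
    return remaining["abundant"] / remaining["rare"]


for t in (0.0, 0.1, 1.0, 10.0):
    print(f"t={t:>4}: abundant/rare single-stranded ratio = {abundance_ratio(t):.2f}")
```

In this idealized model the 1000-fold starting difference shrinks toward 1 as reassociation proceeds, because abundant species reanneal much faster than rare ones.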
Methods to Normalize cDNA Libraries
Five groups have developed methods to normalize cDNA libraries based on the kinetic approach (Ko, 1990; Patanjali et al., 1991; Sasaki et al., 1994; Soares et al., 1994; Puzyrev et al., 1995; Soares et al., 1996). Ko (1990) described the construction of a normalized cDNA library by a method involving: (a) ligation of a linker-primer adaptor to cDNAs; (b) three rounds of PCR amplification, denaturation and partial reassociation; (c) separation of single-stranded cDNAs from double-stranded cDNAs by hydroxyapatite chromatography; (d) conversion of single-stranded cDNAs into double-stranded cDNAs; (e) digestion of the end product using a site present in the linker-primer sequence; and (f) ligation into a vector for cloning. Colony hybridization with eight probes showed a reduction in "abundance variation" after three cycles of normalization. The concentration of some abundant clones decreased up to 34-fold and the concentration of some rare clones increased up to 2.6-fold. However, for some abundant clones the extent of normalization was low, and the concentration of one abundant clone even increased in the normalized library.
Patanjali et al. (1991) constructed a normalized cDNA library by a method similar to that of Ko, involving: (a) synthesis of double-stranded cDNAs by random priming and cloning them in a vector; (b) amplification of the cloned cDNAs by PCR; (c) denaturation and partial reassociation; (d) separation of single-stranded cDNAs from double-stranded cDNAs by hydroxyapatite chromatography; (e) amplification of single-stranded cDNAs by PCR; and (f) ligation of the products into a vector for cloning. Analysis with 10 probes showed an extreme decrease of concentration for a very abundantly expressed ribosomal gene (30% in the unnormalized library, and 2,500 times less in the normalized library), and the concentration of some rare clones increased up to 3-fold. As in Ko's method, the extent of normalization was lower than expected for some clones.
Puzyrev et al. (1995) reported the construction of a normalized cDNA library by a similar method, which involved: (a) amplification of cloned cDNAs by PCR; (b) denaturation and partial reassociation of amplified cDNAs in the presence of excess "competitors" (sequences common to all cDNAs); (c) separation of single-stranded cDNAs from double-stranded cDNAs by hydroxyapatite chromatography; (d) amplification of single-stranded cDNAs by PCR; and (e) cloning these cDNAs into lambda gt11. Analysis with ten probes showed that abundant clones were reduced 3- to 20-fold, but the less abundant clones tested were not greatly enriched.
Sasaki et al. (1994) constructed a normalized cDNA library by a procedure which involved: (a) synthesis of first-strand cDNA; (b) binding the cDNA to a matrix; (c) sequential cycles of hybridizing the matrix-bound cDNA with a corresponding whole mRNA population and eluting unhybridized mRNA; and (d) constructing a normalized cDNA library with use of mRNA eluted in step (c). Analysis with 7 probes showed a 100-fold decrease in concentration for an abundant β-globin clone, and an increase up to 6-fold for four rare clones.
Soares et al. (1994) described the construction of a normalized cDNA library from human infant brain by a method involving: (a) construction of a cDNA library in a vector capable of being converted to single-stranded circles, and capable of producing strands complementary to the single-stranded circles; (b) converting the cDNA library to single-stranded circles; (c) generating strands complementary to the single-stranded circles; (d) hybridizing the single-stranded circles converted in step (b) with complementary strands of step (c) to produce partial duplexes; (e) separating the unhybridized single-stranded circles from the hybridized circles by hydroxyapatite chromatography; (f) conversion of the unhybridized single-stranded circles into partial duplexes; and (g) electroporation into E. coli. Normalization was achieved with this method for most cDNA species examined. The concentrations of "rare cDNAs" were increased 2- to 30-fold in normalized libraries; however, the concentrations of some abundant cDNAs were reduced only 3-fold.
Bonaldo et al. (1996) reported the construction of cDNA libraries from human, mouse, and rat by normalization and subtraction. Several methods were described. For example, to avoid or reduce continued isolation of known clones, subtractive hybridization was applied to reduce the representation of previously arrayed and sequenced clones from normalized libraries. Another method to improve normalization used RNA synthesized from abundant cDNAs to hybridize with single-stranded circles from the starting library. The subtractive method involved: (a) construction of a cDNA library using a vector that permits both transcription of the cDNA inserts and conversion of the cDNA library to single-stranded circles; (b) transcription of the cDNA inserts in vitro; (c) purification of the single-stranded circles by hydroxyapatite (HAP) chromatography; (d) in the presence of blocking oligonucleotides, hybridizing the single-stranded circles prepared in step (c) at an appropriate Cot with complementary RNAs from step (b) to produce partial duplexes; (e) separating the partial duplexes from the single-stranded circles by HAP chromatography; (f) converting the partial duplexes to complete double-stranded DNA circles and electroporating them into E. coli to create a mini-library enriched for abundant cDNAs; (g) transcribing the double-stranded plasmid mini-library from step (f); (h) hybridizing the single-stranded circles purified in step (c) with abundant RNAs from step (g) at an appropriate Cot; (i) separating unhybridized single-stranded circles from step (h) by HAP chromatography; and (j) converting single-stranded circles from step (i) into double-stranded circles and electroporating them into E. coli to generate a subtractive cDNA library.
The present invention provides a method for producing a cDNA library enriched for rare cDNAs and reduced in abundant cDNAs which comprises: (a) obtaining a pool of linear double-stranded cDNAs; (b) cloning a first portion of the pool of cDNAs into a first vector to create a first cDNA library; (c) cloning a second portion of the pool of cDNAs into a second vector to create a second cDNA library; (d) producing single-stranded linear cDNA inserts (target cDNA) from the first cDNA library; (e) producing single-stranded circles (target cDNA) from the second cDNA library; (f) producing a pool of abundant linear cDNAs (driver cDNA) from the first and the second cDNA libraries by the following steps: (i) amplifying the cDNA inserts from the first and the second libraries by polymerase chain reaction using two pairs of appropriate primers which specifically hybridize with the first and second vectors, respectively; (ii) removing DNA sequences common to all of the amplified products from step (i); (iii) denaturing the amplified products from step (ii); (iv) partially reassociating the denatured products from step (iii) in a hybridization mixture under appropriate hybridization conditions so as to produce duplexes of abundant cDNAs; and (v) removing unreassociated cDNAs from step (iv), thereby producing the pools of abundant linear cDNAs from the first and the second cDNA libraries; (g) hybridizing the linear cDNA inserts from step (d) or the single-stranded circles from step (e) with an excess amount of the abundant cDNA pool produced in step (f)(v) from the second cDNA library or the first cDNA library, respectively, under hybridization conditions to produce duplexes; and (h) isolating single-stranded linear cDNA inserts or single-stranded circles which remain after the hybridization of step (g), thereby producing cDNA or a cDNA library enriched for rare cDNAs and reduced in abundant cDNAs.
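The logic of steps (f) through (h) can be sketched with a toy kinetic model in Python. This is an illustrative simplification and not the procedure of the invention. It assumes that the driver for each species is the duplex fraction formed during partial reassociation (second-order kinetics), that target loss during hybridization with excess driver is pseudo-first-order, that all species share one rate constant, and that driver self-reassociation can be ignored. All names, concentrations, times, and the driver excess factor are hypothetical.

```python
import math


def driver_pool(c0, k, t):
    """Duplex amount formed by partial reassociation (steps f(iv)-(v)), used as driver."""
    return c0 - c0 / (1.0 + k * c0 * t)


def target_remaining(target, driver, k, t):
    """Single-stranded target left after hybridizing with excess driver (steps g-h)."""
    return target * math.exp(-k * driver * t)


K = 1.0               # assumed shared rate constant (arbitrary units)
T_DRIVER = 0.001      # hypothetical partial-reassociation time for making driver
T_SUBTRACT = 0.001    # hypothetical driver/target hybridization time
DRIVER_EXCESS = 10.0  # hypothetical fold excess of driver over target

# Hypothetical starting abundances of three cDNA classes in the target library.
library = {"abundant": 1000.0, "intermediate": 10.0, "rare": 1.0}


def normalize(pool):
    """Return the unhybridized target amounts for every species in the pool."""
    result = {}
    for name, c0 in pool.items():
        driver = DRIVER_EXCESS * driver_pool(c0, K, T_DRIVER)
        result[name] = target_remaining(c0, driver, K, T_SUBTRACT)
    return result


def fractions(pool):
    """Convert absolute amounts into fractions of the whole pool."""
    total = sum(pool.values())
    return {name: amount / total for name, amount in pool.items()}


normalized = normalize(library)
before, after = fractions(library), fractions(normalized)
for name in library:
    print(f"{name:>12}: fraction {before[name]:.4f} -> {after[name]:.4f}")
```

Because the driver pool is dominated by abundant species, abundant targets are removed preferentially, so the fraction of rare cDNAs in the remaining single-stranded material rises while the fraction of abundant cDNAs falls.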