The characterization of cell-specific gene expression finds application in a variety of disciplines, for example in the analysis of differential expression between different tissue types, between different stages of cellular growth, or between normal and diseased states. Fundamental to the characterization of cell-specific gene expression are the detection of mRNA and the construction of comprehensive cDNA libraries. However, the detection of mRNA is often complicated by one or more of the following factors: cell heterogeneity, paucity of material, or the limits of detection for low-abundance mRNA.
In a general method of constructing cDNA libraries, polyA mRNA is prepared from the desired cells, and the first strand of the cDNA is prepared from the polyA mRNA using an RNA-dependent DNA polymerase (“reverse transcriptase”) and an oligodeoxynucleotide primer of 12 to 18 thymidine residues. In another method, the primer contains one or two additional nucleotides at its 3′ end that can hybridize to the mRNA sequence immediately upstream of the polyA tail. Usually, the first polyA-non-complementary nucleotide is a deoxyadenylate, deoxyguanylate, or deoxycytidylate (“dC”), and the second nucleotide can be any deoxynucleotide. The use of two nucleotides can provide a more accurate positioning of the primer at the junction between the mRNA and the polyA tail.
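The anchored primer design described above can be illustrated with a minimal sketch that enumerates the possible primer sequences. The function name `anchored_primers`, the default of 12 thymidines, and the choice to write each primer 5′ to 3′ with the anchor nucleotides at the 3′ end are assumptions made for illustration only; they are not part of any described protocol.

```python
from itertools import product

# Hypothetical helper, for illustration only.
# Primers are written 5' -> 3': a run of T's followed by the anchor.
FIRST_ANCHOR = "AGC"    # first anchor base: dA, dG, or dC (never dT)
SECOND_ANCHOR = "ACGT"  # second anchor base: any deoxynucleotide


def anchored_primers(t_length=12, anchor_len=2):
    """Return all oligo(dT) primers carrying a 1- or 2-base anchor."""
    if not 12 <= t_length <= 18:
        raise ValueError("t_length is expected to be 12 to 18")
    if anchor_len not in (1, 2):
        raise ValueError("anchor_len must be 1 or 2")
    pools = [FIRST_ANCHOR, SECOND_ANCHOR][:anchor_len]
    return ["T" * t_length + "".join(anchor) for anchor in product(*pools)]


if __name__ == "__main__":
    for primer in anchored_primers():
        print(primer)
```

With a two-base anchor there are 3 × 4 = 12 distinct primers, and with a one-base anchor there are 3. Excluding dT from the first anchor position is what keeps the anchor from pairing with the polyA tail itself.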
The second strand of the cDNA is synthesized by one of several methods, the more efficient of which are commonly known as “replacement synthesis” and “primed synthesis.” Replacement synthesis uses ribonuclease H (“RNAase H”), which cleaves the phosphodiester backbone of RNA that is in an RNA:DNA hybrid, leaving a 3′ hydroxyl and a 5′ phosphate. The resulting nicks and gaps in the mRNA strand create a series of RNA primers that are used by E. coli DNA polymerase I, or its “Klenow” fragment, to synthesize the second strand of the cDNA. This reaction is very efficient; however, the cDNAs produced most often lack the 5′ terminus of the mRNA sequence.
Primed synthesis to generate the second cDNA strand is a general name for several methods that are more difficult than replacement synthesis yet clone the 5′ terminal sequences with high efficiency. In general, after the synthesis of the first cDNA strand, the 3′ end of the cDNA strand is extended with terminal transferase, an enzyme that adds a homopolymeric “tail” of deoxynucleotides, most commonly deoxycytidylate. This tail is then hybridized to a primer of oligodeoxyguanylate, or to a synthetic fragment of DNA with a deoxyguanylate tail, and the second strand of the cDNA is synthesized using a DNA-dependent DNA polymerase.
Once both cDNA strands have been synthesized, the cDNA library is constructed by cloning the cDNAs into an appropriate plasmid or viral vector. In practice this can be done by directly ligating the blunt ends of the cDNAs into a vector that has been digested by a restriction endonuclease to produce blunt ends. Blunt-end ligations are very inefficient, however, and this is rarely the method of choice. A more commonly used method involves adding synthetic linkers or adapters containing restriction endonuclease recognition sequences to the ends of the cDNAs. The cDNAs can then be cloned into the desired vector at a greater efficiency.
One potential problem with the current method of constructing cDNA libraries is that the hybridization of the oligo dT primer to the polyA tail of the mRNA in the initial step is not perfect. The primer does not necessarily position accurately at the junction between the mRNA and its polyA tail. Therefore, there may be continuous stretches of T's in addition to the T's on the first strand primer. While this does not usually affect efficiencies in sequencing from the 5′ end, it severely compromises the ability to obtain accurate and successful sequencing from the 3′ (polyA tail) end. Thus, there exists a need for improved methods and procedures of cDNA synthesis and cloning.
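The effect of primer mispositioning can be made concrete with a deliberately simplified sketch. It assumes an unanchored oligo dT primer that pairs perfectly with some window of the polyA tail, represents the mRNA body with the DNA alphabet, and ignores partial hybridization. The names `first_strand`, `leading_t_run`, and the `offset` parameter (the number of A's between the junction and the primed window) are hypothetical and introduced only for this illustration.

```python
def reverse_complement(seq):
    """Reverse complement of a sequence written in the DNA alphabet."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def first_strand(body, tail_len, primer_len, offset):
    """Model the first-strand cDNA (5' -> 3') for one priming position.

    offset = 0 means the primer sits exactly at the mRNA/polyA junction.
    A larger offset means the primer annealed further into the tail, so
    the polymerase copies `offset` extra A's as T's before the body.
    """
    max_offset = tail_len - primer_len
    if max_offset < 0 or not 0 <= offset <= max_offset:
        raise ValueError("primer does not fit within the polyA tail")
    return "T" * primer_len + "T" * offset + reverse_complement(body)


def leading_t_run(cdna):
    """Length of the uninterrupted T stretch at the 5' end of the cDNA."""
    return len(cdna) - len(cdna.lstrip("T"))


if __name__ == "__main__":
    body = "ATGGCC"  # made-up 3' end of an mRNA body, for illustration
    for offset in range(0, 9):
        cdna = first_strand(body, tail_len=20, primer_len=12, offset=offset)
        print(offset, leading_t_run(cdna))
```

Under these assumptions a 12-residue primer on a 20-residue tail yields a T stretch anywhere from 12 to 20 residues long, depending only on where the primer happened to anneal. That variable homopolymer run is what a read from the 3′ end must pass through before reaching informative sequence.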