cDNA and cDNA Libraries
In examining the structure and physiology of an organism, tissue or cell, it is often desirable to determine its genetic content. The genetic framework of an organism is encoded in the double-stranded sequence of nucleotide bases in the deoxyribonucleic acid (DNA) which is contained in the somatic and germ cells of the organism. The genetic content of a particular segment of DNA, or gene, is only manifested upon production of the protein which the gene encodes. In order to produce a protein, a complementary copy of one strand of the DNA double helix (the “coding” strand) is produced by polymerase enzymes, resulting in a specific sequence of ribonucleic acid (RNA). This particular type of RNA, since it contains the genetic message from the DNA for production of a protein, is called messenger RNA (mRNA).
Within a given cell, tissue or organism, there exist myriad mRNA species, each encoding a separate and specific protein. This fact provides a powerful tool to investigators interested in studying genetic expression in a tissue or cell—mRNA molecules may be isolated and further manipulated by various molecular biological techniques, thereby allowing the elucidation of the full functional genetic content of a cell, tissue or organism.
One common approach to the study of gene expression is the production of complementary DNA (cDNA) clones. In this technique, the mRNA molecules from an organism are isolated from an extract of the cells or tissues of the organism. This isolation often employs solid chromatography matrices, such as cellulose or agarose, to which oligomers of thymidine (T) have been complexed. Since the 3′ termini on most eukaryotic mRNA molecules contain a string of adenosine (A) bases, and since A binds to T, the mRNA molecules can be rapidly purified from other molecules and substances in the tissue or cell extract. From these purified mRNA molecules, cDNA copies may be made using the enzyme reverse transcriptase (RT), which results in the production of single-stranded cDNA molecules. The single-stranded cDNAs may then be converted into a complete double-stranded DNA copy (i.e., a double-stranded cDNA) of the original mRNA (and thus of the original double-stranded DNA sequence, encoding this mRNA, contained in the genome of the organism) by the action of a DNA polymerase. The protein-specific double-stranded cDNAs can then be inserted into a plasmid or viral vector, which is then introduced into a host bacterial, yeast, animal or plant cell. The host cells are then grown in culture media, resulting in a population of host cells containing (or in many cases, expressing) the gene of interest.
This entire process, from isolation of mRNA to insertion of the cDNA into a plasmid or vector to growth of host cell populations containing the isolated gene, is termed “cDNA cloning.” If cDNAs are prepared from a number of different mRNAs, the resulting set of cDNAs is called a “cDNA library,” an appropriate term since the set of cDNAs represents a “population” of genes comprising the functional genetic information present in the source cell, tissue or organism. Genotypic analysis of these cDNA libraries can yield much information on the structure and function of the organisms from which they were derived.
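The workflow described above, selection of polyadenylated mRNA followed by first-strand and second-strand synthesis, can be illustrated with a minimal sketch. It is a toy string model rather than a laboratory protocol: the function names, the example sequences, and the minimum tail length of 10 A residues are all assumptions chosen for illustration and do not come from the text.

```python
from typing import List, Tuple

# Illustrative sketch only: a toy model of oligo(dT) selection and
# cDNA synthesis. All names, sequences and thresholds are hypothetical.

RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def has_poly_a_tail(rna: str, min_tail: int = 10) -> bool:
    """Return True if the RNA (written 5'->3') ends in at least min_tail A residues."""
    return rna.endswith("A" * min_tail)


def first_strand_cdna(mrna: str) -> str:
    """Model reverse transcription: return the complementary DNA, written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def second_strand(cdna: str) -> str:
    """Model DNA polymerase: return the strand complementary to the first strand, 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna))


def build_cdna_library(rnas: List[str], min_tail: int = 10) -> List[Tuple[str, str]]:
    """Keep polyadenylated RNAs and return (first strand, second strand) pairs."""
    library = []
    for rna in rnas:
        if has_poly_a_tail(rna, min_tail):
            first = first_strand_cdna(rna)
            library.append((first, second_strand(first)))
    return library
```

In this model the second strand has the same sequence as the original mRNA with T in place of U, which mirrors the statement that the double-stranded cDNA is a copy of the original message. An RNA without a 3′ run of A residues is simply left out of the library, as it would fail to bind the oligo(dT) matrix.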
Retroviral Reverse Transcriptase Enzymes
Three prototypical forms of retroviral RT have been studied thoroughly. Moloney Murine Leukemia Virus (M-MLV) RT contains a single subunit of 78 kDa with RNA-dependent DNA polymerase and RNase H activity. This enzyme has been cloned and expressed in a fully active form in E. coli (reviewed in Prasad, V. R., Reverse Transcriptase, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press, p. 135 (1993)). Human Immunodeficiency Virus (HIV) RT is a heterodimer of p66 and p51 subunits in which the smaller subunit is derived from the larger by proteolytic cleavage. The p66 subunit has both an RNA-dependent DNA polymerase and an RNase H domain, while the p51 subunit has only a DNA polymerase domain. Active HIV p66/p51 RT has been cloned and expressed successfully in a number of expression hosts, including E. coli (reviewed in Le Grice, S. F. J., Reverse Transcriptase, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press, p. 163 (1993)). Within the HIV p66/p51 heterodimer, the 51-kDa subunit is catalytically inactive, and the 66-kDa subunit has both DNA polymerase and RNase H activity (Le Grice, S. F. J., et al., EMBO Journal 10:3905 (1991); Hostomsky, Z., et al., J. Virol. 66:3179 (1992)). Avian Sarcoma-Leukosis Virus (ASLV) RT, which includes but is not limited to Rous Sarcoma Virus (RSV) RT, Avian Myeloblastosis Virus (AMV) RT, Avian Erythroblastosis Virus (AEV) Helper Virus MCAV RT, Avian Myelocytomatosis Virus MC29 Helper Virus MCAV RT, Avian Reticuloendotheliosis Virus (REV-T) Helper Virus REV-A RT, Avian Sarcoma Virus UR2 Helper Virus UR2AV RT, Avian Sarcoma Virus Y73 Helper Virus YAV RT, Rous Associated Virus (RAV) RT, and Myeloblastosis Associated Virus (MAV) RT, is also a heterodimer of two subunits, α (approximately 62 kDa) and β (approximately 94 kDa), in which α is derived from β by proteolytic cleavage (reviewed in Prasad, V. R., Reverse Transcriptase, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press (1993), p. 135).
ASLV RT can exist in two additional catalytically active structural forms, ββ and α (Hizi, A. and Joklik, W. K., J. Biol. Chem. 252: 2281 (1977)). Sedimentation analysis suggests αβ and ββ are dimers and that the α form exists in an equilibrium between monomeric and dimeric forms (Grandgenett, D. P., et al., Proc. Nat. Acad. Sci. USA 70: 230 (1973); Hizi, A. and Joklik, W. K., J. Biol. Chem. 252: 2281 (1977); and Soltis, D. A. and Skalka, A. M., Proc. Nat. Acad. Sci. USA 85: 3372 (1988)). The ASLV αβ and ββ RTs are the only known examples of retroviral RT that include three different activities in the same protein complex: DNA polymerase, RNase H, and DNA endonuclease (integrase) activities (reviewed in Skalka, A. M., Reverse Transcriptase, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press (1993), p. 193). The α form lacks the integrase domain and activity.
Various forms of the individual subunits of ASLV RT have been cloned and expressed. These include a 98-kDa precursor polypeptide that is normally processed proteolytically to β and a 4-kDa polypeptide removed from the β carboxy end (Alexander, F., et al., J. Virol. 61: 534 (1987) and Anderson, D. et al., Focus 17:53 (1995)), and the mature β subunit (Weis, J. H. and Salstrom, J. S., U.S. Pat. No. 4,663,290 (1987); and Soltis, D. A. and Skalka, A. M., Proc. Nat. Acad. Sci. USA 85:3372 (1988)). Heterodimeric RSV αβ RT has also been purified from E. coli cells expressing a cloned RSV β gene (Chernov, A. P., et al., Biomed. Sci. 2:49 (1991)). See also published PCT application WO 98/47912.
Labeling Nucleic Acid Molecules
As noted above, the conversion of mRNA to cDNA by RT-mediated reverse transcription is an essential step in the study of proteins expressed from cloned genes. Reverse transcription of nucleic acid molecules, particularly mRNA, to make labeled nucleic acid molecules (e.g., labeled cDNA) is also important in the generation of labeled probes for use in detection and diagnostics. Typically, fluorescent labels are used in the generation of such probes. To date, SuperScript™ II (an RNase H minus derivative of M-MLV RT available from Life Technologies, Inc.) has been used in the generation of fluorescently labeled probes from mRNA templates (DeRisi et al., Science 278:680–686 (1997)). However, the incorporation rate of fluorescent nucleotides during synthesis is relatively low (less than 2%), perhaps due to the inability of M-MLV RT to effectively use fluorescently labeled nucleotides as substrates during nucleic acid synthesis. Accordingly, there exists a need for more efficient incorporation of labeled nucleotides, particularly fluorescently labeled nucleotides, during reverse transcription of a nucleic acid template. Efficient incorporation of such nucleotides will allow for improved synthesis of labeled probes which may be used in the research market as well as in the field of diagnostics.
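To give a sense of scale for the incorporation figure quoted above, the following sketch estimates how many labeled nucleotides a single cDNA would carry. It rests on an assumption not stated in the text: that the "incorporation rate" is the fraction of all incorporated nucleotides that carry a label. The function name and the 1000-nucleotide example length are likewise hypothetical.

```python
def expected_labels(cdna_length_nt: int, labeled_fraction: float) -> float:
    """Expected number of labeled nucleotides in one cDNA.

    Assumes (hypothetically) that labeled_fraction is the fraction of all
    incorporated nucleotides that carry a fluorescent label.
    """
    if cdna_length_nt < 0:
        raise ValueError("cdna_length_nt must be non-negative")
    if not 0.0 <= labeled_fraction <= 1.0:
        raise ValueError("labeled_fraction must be between 0 and 1")
    return cdna_length_nt * labeled_fraction


# Upper bound for a hypothetical 1000-nt cDNA at the 2% ceiling quoted above.
upper_bound = expected_labels(1000, 0.02)
```

Under that assumption, a 1000-nucleotide cDNA synthesized at the 2% ceiling would carry at most about 20 labeled residues, and proportionally fewer at lower rates.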