Field of the Invention
The present invention is in the fields of molecular and cellular biology. The invention is generally related to reverse transcriptase enzymes and methods for the reverse transcription of nucleic acid molecules, especially messenger RNA molecules. Specifically, the invention relates to reverse transcriptase enzymes which have been mutated or modified to increase thermostability, decrease terminal deoxynucleotidyl transferase activity, and/or increase fidelity, and to methods of producing, amplifying or sequencing nucleic acid molecules (particularly cDNA molecules) using these reverse transcriptase enzymes or compositions. The invention also relates to nucleic acid molecules produced by these methods and to the use of such nucleic acid molecules to produce desired polypeptides. The invention also relates to nucleic acid molecules encoding the reverse transcriptases of the invention, to vectors containing such nucleic acid molecules, and to host cells containing such nucleic acid molecules. The invention also concerns kits or compositions comprising such enzymes.
Related Art
cDNA and cDNA Libraries
In examining the structure and physiology of an organism, tissue or cell, it is often desirable to determine its genetic content. The genetic framework of an organism is encoded in the double-stranded sequence of nucleotide bases in the deoxyribonucleic acid (DNA) which is contained in the somatic and germ cells of the organism. The genetic content of a particular segment of DNA, or gene, is typically manifested upon production of the protein which the gene encodes. In order to produce a protein, a complementary copy of one strand of the DNA double helix is produced by RNA polymerase enzymes, resulting in a specific sequence of ribonucleic acid (RNA). This particular type of RNA, since it contains the genetic message from the DNA for production of a protein, is called messenger RNA (mRNA).
Within a given cell, tissue or organism, there exist myriad mRNA species, each encoding a separate and specific protein. This fact provides a powerful tool to investigators interested in studying genetic expression in a tissue or cell. mRNA molecules may be isolated and further manipulated by various molecular biological techniques, thereby allowing the elucidation of the full functional genetic content of a cell, tissue or organism.
One common approach to the study of gene expression is the production of complementary DNA (cDNA) clones. In this technique, the mRNA molecules from an organism are isolated from an extract of the cells or tissues of the organism. This isolation often employs solid chromatography matrices, such as cellulose or agarose, to which oligomers of thymidine (T) have been complexed. Since the 3′ termini on most eukaryotic mRNA molecules contain a string of adenosine (A) bases, and since A base pairs with T, the mRNA molecules can be rapidly purified from other molecules and substances in the tissue or cell extract. From these purified mRNA molecules, cDNA copies may be made using the enzyme reverse transcriptase (RT), which results in the production of single-stranded cDNA molecules. This reaction is typically referred to as the first strand reaction. The single-stranded cDNAs may then be converted into a complete double-stranded DNA copy (i.e., a double-stranded cDNA) of the original mRNA (and thus of the original double-stranded DNA sequence, encoding this mRNA, contained in the genome of the organism) by the action of a DNA polymerase. The protein-specific double-stranded cDNAs can then be inserted into a plasmid or viral vector, which is then introduced into a host bacterial, yeast, animal or plant cell. The host cells are then grown in culture media, resulting in a population of host cells containing (or in many cases, expressing) the gene of interest.
This entire process, from isolation of mRNA from a source organism or tissue to insertion of the cDNA into a plasmid or vector to growth of host cell populations containing the isolated gene, is termed “cDNA cloning.” The set of cDNAs prepared from a given source of mRNAs is called a “cDNA library.” The cDNA clones in a cDNA library correspond to the genes transcribed in the source tissue. Analysis of a cDNA library can yield much information on the pattern of gene expression in the organism or tissue from which it was derived.
Retroviral Reverse Transcriptase Enzymes
Three prototypical forms of retroviral reverse transcriptase have been studied thoroughly. Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase contains a single subunit of 78 kDa with RNA-dependent DNA polymerase and RNase H activity. This enzyme has been cloned and expressed in a fully active form in E. coli (reviewed in Prasad, V. R., Reverse Transcriptase, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press, p. 135 (1993)). Human Immunodeficiency Virus (HIV) reverse transcriptase is a heterodimer of p66 and p51 subunits in which the smaller subunit is derived from the larger by proteolytic cleavage. The p66 subunit has both an RNA-dependent DNA polymerase and an RNase H domain, while the p51 subunit has only a DNA polymerase domain. Active HIV p66/p51 reverse transcriptase has been cloned and expressed successfully in a number of expression hosts, including E. coli (reviewed in Le Grice, S. F. J., Reverse Transcriptase, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press, p. 163 (1993)). Within the HIV p66/p51 heterodimer, the 51-kDa subunit is catalytically inactive, and the 66-kDa subunit has both DNA polymerase and RNase H activity (Le Grice, S. F. J., et al., EMBO Journal 10:3905 (1991); Hostomsky, Z., et al., J. Virol. 66:3179 (1992)).
Avian Sarcoma-Leukosis Virus (ASLV) reverse transcriptase, which includes but is not limited to Rous Sarcoma Virus (RSV) reverse transcriptase, Avian Myeloblastosis Virus (AMV) reverse transcriptase, Avian Erythroblastosis Virus (AEV) Helper Virus MCAV reverse transcriptase, Avian Myelocytomatosis Virus MC29 Helper Virus MCAV reverse transcriptase, Avian Reticuloendotheliosis Virus (REV-T) Helper Virus REV-A reverse transcriptase, Avian Sarcoma Virus UR2 Helper Virus UR2AV reverse transcriptase, Avian Sarcoma Virus Y73 Helper Virus YAV reverse transcriptase, Rous Associated Virus (RAV) reverse transcriptase, and Myeloblastosis Associated Virus (MAV) reverse transcriptase, is also a heterodimer of two subunits, α (approximately 62 kDa) and β (approximately 94 kDa), in which α is derived from β by proteolytic cleavage (reviewed in Prasad, V. R., Reverse Transcriptase, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press, p. 135 (1993)). ASLV reverse transcriptase can exist in two additional catalytically active structural forms, ββ and α (Hizi, A. and Joklik, W. K., J. Biol. Chem. 252:2281 (1977)). Sedimentation analysis suggests that αβ and ββ are dimers and that the α form exists in an equilibrium between monomeric and dimeric forms (Grandgenett, D. P., et al., Proc. Nat. Acad. Sci. USA 70:230 (1973); Hizi, A. and Joklik, W. K., J. Biol. Chem. 252:2281 (1977); and Soltis, D. A. and Skalka, A. M., Proc. Nat. Acad. Sci. USA 85:3372 (1988)). The ASLV αβ and ββ reverse transcriptases are the only known examples of retroviral reverse transcriptase that include three different activities in the same protein complex: DNA polymerase, RNase H, and DNA endonuclease (integrase) activities (reviewed in Skalka, A. M., Reverse Transcriptase, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press, p. 193 (1993)). The α form lacks the integrase domain and activity.
Various forms of the individual subunits of ASLV reverse transcriptase have been cloned and expressed. These include a 98-kDa precursor polypeptide that is normally processed proteolytically to β and a 4-kDa polypeptide removed from the β carboxy end (Alexander, F., et al., J. Virol. 61:534 (1987) and Anderson, D., et al., Focus 17:53 (1995)), and the mature β subunit (Weis, J. H. and Salstrom, J. S., U.S. Pat. No. 4,663,290 (1987); and Soltis, D. A. and Skalka, A. M., Proc. Nat. Acad. Sci. USA 85:3372 (1988)). (See also Werner, S. and Wohrl, B. M., Eur. J. Biochem. 267:4740-4744 (2000); Werner, S. and Wohrl, B. M., J. Virol. 74:3245-3252 (2000); Werner, S. and Wohrl, B. M., J. Biol. Chem. 274:26329-26336 (1999).) Heterodimeric RSV αβ reverse transcriptase has also been purified from E. coli cells expressing a cloned RSV β gene (Chernov, A. P., et al., Biomed. Sci. 2:49 (1991)).
Reverse Transcription Efficiency
As noted above, the conversion of mRNA into cDNA by reverse transcriptase-mediated reverse transcription is an essential step in the study of proteins expressed from cloned genes. However, the use of unmodified reverse transcriptase to catalyze reverse transcription is inefficient for a number of reasons. First, reverse transcriptase sometimes degrades an RNA template before the first strand reaction is initiated or completed, primarily due to the intrinsic RNase H activity present in reverse transcriptase. In addition, mis-priming of the mRNA template molecule can introduce errors into the cDNA first strand, while secondary structure of the mRNA molecule itself may make some mRNAs refractory to first strand synthesis.
Removal of the RNase H activity of reverse transcriptase can eliminate the first problem and improve the efficiency of reverse transcription (Gerard, G. F., et al., FOCUS 11(4):60 (1989); Gerard, G. F., et al., FOCUS 14(3):91 (1992)). However, such reverse transcriptases (“RNase H−” forms) do not address the additional problems of mis-priming and mRNA secondary structure.
Another factor which influences the efficiency of reverse transcription is the ability of RNA to form secondary structures. Such secondary structures can form, for example, when regions of RNA molecules have sufficient complementarity to hybridize and form double-stranded RNA. Generally, the formation of RNA secondary structures can be reduced by raising the temperature of solutions which contain the RNA molecules. Thus, in many instances, it is desirable to reverse transcribe RNA at temperatures above 37° C. However, reverse transcriptases known in the art generally lose activity when incubated at temperatures much above 37° C. (e.g., 50° C.).