1. Field of the Invention
This invention is in the fields of molecular and cellular biology. The invention is generally directed to amplification of nucleic acid molecules and to methods for cloning nucleic acid molecules (DNA or RNA) that have been amplified or synthesized, particularly those nucleic acid molecules that have undergone PCR amplification. In particular, the invention concerns methods of cloning amplified nucleic acid molecules comprising the use of inhibitors of nucleic acid polymerases that carry out the amplification. The invention further concerns nucleic acid molecules produced by such methods and vectors and host cells comprising such nucleic acid molecules. The invention further relates to compositions for facilitating cloning of amplified nucleic acid molecules.
2. Related Art
Cloning of Nucleic Acid Molecules
In examining the structure and physiology of an organism, tissue or cell, it is often desirable to determine its genetic content. The genetic framework of an organism is encoded in the double-stranded sequence of nucleotide bases in the deoxyribonucleic acid (DNA) which is contained in the somatic and germ cells of the organism. The genetic content of a particular segment of DNA, or gene, is only manifested upon production of the protein which the gene encodes. In order to produce a protein, a complementary copy of one strand of the DNA double helix (the "coding" strand) is produced by polymerase enzymes, resulting in a specific sequence of ribonucleic acid (RNA). This particular type of RNA, since it contains the genetic message from the DNA for production of a protein, is called messenger RNA (mRNA).
Within a given cell, tissue or organism, there exist myriad mRNA species, each encoding a separate and specific protein. This fact provides a powerful tool to investigators interested in studying genetic expression in a tissue or cell: mRNA molecules may be isolated and further manipulated by various molecular biological techniques, thereby allowing the elucidation of the full functional genetic content of a cell, tissue or organism.
One common approach to the study of gene expression is the production of complementary DNA (cDNA) clones. In this technique, the mRNA molecules from an organism are isolated from an extract of the cells or tissues of the organism. This isolation often employs solid chromatography matrices, such as cellulose or agarose, to which oligomers of thymidine (T) have been complexed. Since the 3' termini on most eukaryotic mRNA molecules contain a string of adenosine (A) bases, and since A binds to T, the mRNA molecules can be rapidly purified from other molecules and substances in the tissue or cell extract. From these purified mRNA molecules, cDNA copies may be made using one or more polypeptides having reverse transcriptase (RT) activity, which results in the production of single-stranded cDNA molecules. The single-stranded cDNAs may then be converted into a complete double-stranded DNA copy (i.e., a double-stranded cDNA) of the original mRNA (and thus of the original double-stranded DNA sequence, encoding this mRNA, contained in the genome of the organism) by the action of a polypeptide having nucleic acid polymerase activity, such as a DNA polymerase. The protein-specific double-stranded cDNAs can then be inserted into a plasmid or viral vector (also called cloning vehicles), using controlled restriction enzyme digestion and ligation of the cDNA and the vehicle. The resulting cDNA-vehicle construct is then introduced into a bacterial host, yeast, animal or plant cell and the host cells are then grown in culture media, resulting in a population of host cells containing (or in some cases, expressing) the gene of interest.
This entire process, from isolation of mRNA to insertion of the cDNA into a plasmid or vector to growth of host cell populations containing the isolated gene, is termed "cDNA cloning." If cDNAs are prepared from a number of different mRNAs, the resulting set of cDNAs is called a "cDNA library" which represents a population of genes comprising the functional genetic information present in the source cell, tissue or organism.
A variety of procedures are useful to clone genes. One such method entails analyzing a library of cDNA inserts (derived from a cell expressing the corresponding protein) for the presence of an insert which contains the desired gene. Such an analysis may be conducted by transfecting cells with the vector, inducing the expression of the protein, and then assaying for protein expression, for example, by immunoreaction with an antibody which is specific for the desired protein.
Alternatively, in order to detect the presence of the desired gene, one may employ an oligonucleotide (or set of oligonucleotides) having a nucleotide sequence that is complementary to the nucleotide sequence, or set of sequences, that encodes the desired protein. Such oligonucleotides are used to detect and/or isolate the desired gene by selective hybridization. Techniques of nucleic acid hybridization are disclosed by Maniatis, T., et al., In: Molecular Cloning, a Laboratory Manual, Cold Spring Harbor, N.Y. (1982), and by Haymes, B. D., et al., In: Nucleic Acid Hybridization, a Practical Approach, IRL Press, Washington, D.C. (1985), which references are herein incorporated by reference.
In addition to the above methods, most commonly used cloning vectors have an indicator gene which results in the expression of a specific phenotype in host cells containing the vector (e.g., blue colonies for host cells containing vectors that carry lacZα; see Maniatis, T., et al., Id.). Insertion of heterologous nucleic acid sequences into multiple cloning sites in such vectors interrupts or inactivates the indicator gene, resulting in non-expression of the phenotype (e.g., white colonies for the above-described host cells containing lacZα vectors). Such an approach provides a convenient means for differentiating recombinant clones (i.e., those forming white colonies) from non-recombinant clones (i.e., those forming blue colonies). However, this approach does not prevent the growth of non-recombinant clones.
Nucleic Acid Amplification
Soon after the various enzymes and cofactors involved in nucleic acid synthesis were identified and characterized, it was recognized that their activities could be exploited in vitro to dramatically increase the concentration of, or "amplify," one or more selected nucleotide sequences. For many medical, diagnostic and forensic applications, amplification of a particular nucleic acid molecule is essential to allow its detection in, or isolation from, a sample in which it is present in very low amounts. More recently, in vitro amplification of specific genes has provided powerful and less costly means to facilitate the production of therapeutic proteins by molecular biological techniques, and may have applications in genetic therapy as well.
While a variety of nucleic acid amplification processes have been described, the most commonly employed is the Polymerase Chain Reaction (PCR) technique disclosed in U.S. Pat. Nos. 4,683,195 and 4,683,202. In this process, a sample containing the nucleic acid sequence to be amplified (the "target sequence") is first heated to denature or separate the two strands of the nucleic acid. The sample is then cooled and mixed with specific oligonucleotide primers which hybridize to the target sequence. Following this hybridization, a buffered aqueous solution containing at least one polypeptide having DNA polymerase activity is added to the sample, along with a mixture of the dNTPs that are linked by the polymerase to the replicating nucleic acid strand. After allowing polymerization to proceed to completion, the products are again heat-denatured, subjected to another round of primer hybridization and polymerase replication, and this process is repeated any number of times. Since each nucleic acid product of a given cycle of this process serves as a template for production of two new nucleic acid molecules (one from each parent strand), the PCR process results in an exponential increase in the concentration of the target sequence. Thus, in a well-controlled, high-fidelity PCR process, as few as 20 cycles can result in a more than one million-fold amplification of the target nucleic acid sequence (see U.S. Pat. Nos. 4,683,195 and 4,683,202).
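The arithmetic behind the 20-cycle figure can be sketched as follows. This is a minimal illustration, assuming idealized doubling in every cycle; the function names and the per-cycle `efficiency` parameter are hypothetical and are not taken from the cited patents.

```python
def fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Idealized fold increase in target copies after `cycles` PCR cycles.

    `efficiency` is an assumed fraction of templates copied in each cycle
    (1.0 means perfect doubling); it is an illustrative parameter only.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return (1.0 + efficiency) ** cycles


def cycles_needed(target_fold: float, efficiency: float = 1.0) -> int:
    """Smallest number of cycles whose idealized yield reaches `target_fold`."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    cycles = 0
    fold = 1.0
    while fold < target_fold:
        fold *= 1.0 + efficiency  # each cycle multiplies the copy number
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(fold_amplification(20))    # 1048576.0, i.e. 2 to the 20th power
    print(cycles_needed(1_000_000))  # 20
```

With perfect doubling, 19 cycles give 524,288-fold amplification and 20 cycles give 1,048,576-fold, which is why 20 is the smallest cycle number that exceeds one million-fold. Any efficiency below 1.0 in this model would require more cycles to reach the same yield.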
Other techniques for amplification of target nucleic acid sequences have also been developed. For example, Walker et al. (U.S. Pat. No. 5,455,166; EP 0 684 315) described a method called Strand Displacement Amplification (SDA), which differs from PCR in that it operates at a single temperature and uses a polymerase/endonuclease combination of enzymes to generate single-stranded fragments of the target DNA sequence, which then serve as templates for the production of complementary DNA (cDNA) strands. An alternative amplification procedure, termed Nucleic Acid Sequence-Based Amplification (NASBA) was disclosed by Davey et al. (U.S. Pat. No. 5,409,818; EP 0 329 822). Similar to SDA, NASBA employs an isothermal reaction, but is based on the use of RNA primers for amplification rather than DNA primers as in PCR or SDA.
Amplification-Based Cloning
Standard cloning techniques such as those described above are often useful for cloning nucleic acid sequences that are expressed at relatively high levels in the source cells or tissues. However, these techniques frequently are not particularly sensitive when the starting samples contain only low levels of the nucleic acid molecule of interest. This problem is particularly important when the tissue or cell samples are themselves present in low quantities (as in many medical or forensic applications), or when the specific nucleotide sequence is present or expressed at low levels in the cell/tissue samples.
Amplification-based cloning of nucleic acid molecules, particularly that employing PCR, has been used in the attempt to overcome the lack of sensitivity of earlier approaches (see, e.g., Lee, C. C., et al., Science 239:1288-1291 (1988)). There are a number of methods available for performing such cloning.
In one such method, restriction enzyme sites can be incorporated into the PCR primers; the PCR-generated nucleic acid molecules will thus contain these restriction sites. For cloning of these specific sequences, these amplified nucleic acid molecules can then be digested with restriction enzymes, the digested fragments ligated into an appropriate site within a plasmid vector, and the vector incorporated into a host cell.
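As a rough sketch of how restriction sites carried on the primers end up in the amplified product, the following models the product as plain strings. The toy template, the annealing sequences, the four-base 5' "clamp" and all function names are hypothetical; only the EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences are standard.

```python
# Illustrative sketch only: sequences, clamp and function names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def tailed_primer(annealing: str, site: str, clamp: str = "GCGC") -> str:
    """Build a primer: 5' clamp, then restriction site, then annealing region."""
    return clamp + site + annealing


def amplicon(template: str, fwd_anneal: str, rev_anneal: str,
             fwd_primer: str, rev_primer: str) -> str:
    """Top strand of the product expected after several cycles.

    The product begins with the full forward primer and ends with the
    reverse complement of the full reverse primer, so both 5' tails
    (and the sites they carry) become part of the amplified molecule.
    """
    start = template.find(fwd_anneal)
    if start < 0:
        raise ValueError("forward primer does not anneal to the template")
    inner_start = start + len(fwd_anneal)
    end = template.find(reverse_complement(rev_anneal), inner_start)
    if end < 0:
        raise ValueError("reverse primer does not anneal to the template")
    inner = template[inner_start:end]
    return fwd_primer + inner + reverse_complement(rev_primer)


# Toy template containing neither restriction site.
FWD_ANNEAL = "ATGGCCATTGTAATGG"
REV_TARGET = "GTGCCCGATAG"  # top-strand region bound by the reverse primer
TEMPLATE = "TT" + FWD_ANNEAL + "GCCGCTGAAAGG" + REV_TARGET + "TT"
REV_ANNEAL = reverse_complement(REV_TARGET)

fwd = tailed_primer(FWD_ANNEAL, ECORI_SITE)
rev = tailed_primer(REV_ANNEAL, BAMHI_SITE)
product = amplicon(TEMPLATE, FWD_ANNEAL, REV_ANNEAL, fwd, rev)

if __name__ == "__main__":
    print(product)
```

In this toy case the product carries one EcoRI site near one terminus and one BamHI site near the other, although the template contains neither, so digestion with the two enzymes would leave ends suited to directional insertion into a correspondingly cut vector.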
Alternatively, PCR products generated by Taq DNA polymerase, which typically contain an additional deoxyadenosine (dA) residue at their 3' termini, can be cloned into specific cloning vectors containing 3' deoxythymidine (dT) overhangs which provide a specific recognition sequence for the 3' A residue on the PCR product. This process, often referred to as "TA cloning," provides a means of directly cloning PCR-amplified nucleic acid molecules without the need for preparation of primers with specific restriction sites (see U.S. Pat. No. 5,487,993, which is incorporated herein by reference in its entirety).
In other cloning methods, blunt-end PCR fragments generated by cleavage with certain restriction enzymes (e.g., SmaI, SspI or ScaI) can be cloned into blunt-end insertion sites of cloning vectors (see, e.g., Ausubel, F. M., et al., eds., "Current Protocols in Molecular Biology," New York: John Wiley & Sons, Inc., pp. 3.16.1-3.16.11 (1995)), or PCR-amplified nucleic acid molecules can be cloned using uracil DNA glycosylase (UDG, see U.S. Pat. No. 5,137,814, which is incorporated herein by reference in its entirety). Such blunt-end cloning may also be facilitated by treatment of Taq-amplified PCR products, which contain dA overhangs as described above, with T4 DNA polymerase to remove the dA overhangs (a procedure often termed "polishing") followed by insertion of the resulting blunt-end fragments into blunt-end vector insertion sites as generally described above.
However, the cloning of amplified nucleic acid molecules, especially by restriction enzyme digestion and insertion into cloning vehicles, is usually not simple and straightforward. Problems that plague the investigator are low cloning efficiencies (i.e., a low number of recombinant clones obtained per transformation) and cloning artifacts (i.e., recombinant clones which contain a modified insert). The probable cause of such technical limitations is residual polymerase activity which remains in the reaction mixture after the amplification process (see Bennet, B. L., and Molenaar, A. J., BioTechniques 16:36-37 (1994)). In fact, it has been shown that after 30 rounds of amplification under standard PCR conditions, sufficient residual polymerase activity is present in the reaction mixture to conduct an additional 30 rounds of amplification. Upon digestion of the termini of the amplified nucleic acid molecules with restriction endonucleases to generate 3' recessed ("sticky") ends in the initial stages of cloning, this residual polymerase can utilize remaining dNTPs in the sample to fill in the 3' ends to regenerate an undesirable blunt end. This interference results in poor ligation of the digested insert into a prepared recipient cloning vector which has been manipulated to possess recessed ends compatible with those of the insert. In fact, even the addition of a single nucleotide to the 3' sticky end can inhibit the ligation process and increase the number of incorrect recombinants that an operator must screen. An additional complication is that if the insert is to be ligated into an expression vector for transformation into a host cell to ultimately generate a protein encoded by the insert, the addition of nucleotides to the digested amplification products can often shift the reading frame of the insert and result in expression of an incomplete, mutant and/or nonfunctional protein, especially if the promoter resides in the cloning vector 5' to the insert.
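The two effects described above, loss of a compatible sticky end through fill-in and a reading-frame shift from added nucleotides, can be illustrated with a simplified string model. This is a sketch under stated assumptions: ends are represented only by their 5' overhang written 5' to 3', the EcoRI overhang AATT is used as the example, and the function names and toy open reading frame are hypothetical.

```python
# Illustrative sketch only: a simplified string model with hypothetical names.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def remaining_overhang(overhang: str, bases_filled: int) -> str:
    """5' overhang still single-stranded after partial or full fill-in.

    A polymerase extending the recessed 3' end copies the overhang starting
    at the base nearest the duplex, i.e. from the 3' side of `overhang`.
    """
    if not 0 <= bases_filled <= len(overhang):
        raise ValueError("bases_filled must be between 0 and len(overhang)")
    return overhang[: len(overhang) - bases_filled]


def can_anneal(insert_overhang: str, vector_overhang: str) -> bool:
    """True if two sticky ends are fully complementary (blunt ends excluded)."""
    return bool(insert_overhang) and (
        insert_overhang == reverse_complement(vector_overhang)
    )


def codons(seq: str) -> list[str]:
    """Split a sequence into complete triplets, reading from its first base."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


ECORI_OVERHANG = "AATT"  # 5' overhang left by EcoRI digestion of GAATTC

if __name__ == "__main__":
    # Intact sticky end pairs with the vector end.
    print(can_anneal(ECORI_OVERHANG, ECORI_OVERHANG))            # True
    # A single added nucleotide already destroys compatibility.
    one_added = remaining_overhang(ECORI_OVERHANG, 1)
    print(one_added, can_anneal(one_added, ECORI_OVERHANG))      # AAT False
    # Complete fill-in leaves a blunt end.
    print(repr(remaining_overhang(ECORI_OVERHANG, 4)))           # ''
    # One extra base upstream shifts every downstream codon.
    orf = "ATGGCCATTGTA"
    print(codons(orf))        # ['ATG', 'GCC', 'ATT', 'GTA']
    print(codons("A" + orf))  # ['AAT', 'GGC', 'CAT', 'TGT']
```

The model ignores ligase kinetics and mismatched ligation products; it only shows why an end altered by even one nucleotide no longer matches the prepared vector, and why extra bases at the junction change how the insert is read when translation starts in the vector.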
One often-used approach to solving this technical problem involves multiple organic phenol/chloroform extractions of the amplified nucleic acid molecules, prior to cloning, to aid in the removal of the residual polymerases. Analogous methods involve similar time-consuming technical manipulations such as successive rounds of ethanol precipitation and agarose gel purification. While such techniques may reduce the content of the amplifying polymerase to some extent, they also usually result in reduced yields of clonable product due to loss, destruction and/or structural alteration of the amplified nucleic acid molecules during purification. Thus, the temporal and economic constraints to efficient and high-yield cloning of amplified nucleic acid molecules have yet to be overcome.