The development of DNA cloning techniques for complementary DNA (cDNA) copies of messenger RNA (mRNA) molecules has been of great value in the study of eukaryotic genes. In many cases, the amount of a given mRNA for which cDNA clones are desired is limited by the availability of appropriate tissue sources and/or a low concentration of that specific mRNA in those sources. Therefore, readily obtainable sources may provide only a few copies of a given mRNA molecule from which cDNA clones might be produced.
The requirements for any efficient method for cDNA cloning may be generally summarized as follows: first, full-length double-stranded cDNAs must be produced from the mRNA with high yield; the ends of the resulting DNA fragments must be made capable of being joined efficiently to the vector DNA by enzymatic ligation; production of undesirable ligation byproducts must be minimized; and, preferably, insertion of the cDNA into the vector DNA should provide expression of the cDNA to facilitate detection of the desired clone by means of the product.
Production of the protein product may be necessary for detecting a gene when no nucleic acid probes for the desired gene are available. More generally, such expression of the protein is desirable because, in terms of copy number, the protein provides a molecular signal that is greatly amplified in relation to the DNA molecules of the cloned gene inside the host cell.
As it is difficult to achieve high efficiency of conversion of mRNA molecules into full-length cDNA clones, especially when the mRNA of interest is relatively long, several refinements in cDNA cloning strategy have been made. Among them, the Okayama-Berg method significantly improved the efficiency of full-length cDNA cloning.
The Okayama-Berg approach has several advantages over previous, conventional methods for cloning cDNAs. The following section is intended to highlight these advantages in relation to the main steps of this complicated method. For a more complete and detailed description of the method, see the original publication [Okayama, H. and Berg, P. (1982) Mol. Cell. Biol. 2, 161-170], which is hereby incorporated herein by reference.
The main advantages of the Okayama-Berg method for cDNA cloning relate to the fact that, as part of the processing needed to form mRNAs, transcripts of eukaryotic genes undergo enzymatic addition of multiple adenosine residues at the 3' end, thereby acquiring what is known as a "poly(A) tail". In the present context, the term mRNA encompasses any RNA species from any source, natural or synthetic, having a 3' poly(A) tail comprising two or more adenosine residues.
In the original Okayama-Berg approach, synthesis of the first DNA strand from the mRNA template is initiated by annealing the 3' poly(A) of the eukaryotic mRNA to an oligo(dT) primer which forms an extension of one end of a DNA strand of the cloning plasmid. First strand cDNA synthesis by this "plasmid-priming" method directs the orientation of the sequence within the cDNA into a unique relationship with the sequence in the plasmid; hence, this approach has been called "directional" cloning. Directional cloning ensures that every cDNA clone that is formed will be correctly oriented for a promoter provided in the cloning plasmid (an SV40 promoter in the original Okayama-Berg system) to drive transcription of the proper cDNA strand to produce RNA with the correct sense for translation into the protein encoded by the original mRNA template.
To provide high efficiency of ligation in cloning DNA segments in general, restriction nucleases are utilized to produce short single-stranded ends on the DNA that are complementary in base sequence to any other DNA end produced by the same enzyme. Accordingly, these single-stranded ends can anneal together by forming specific DNA base pairs, or, in the vernacular, they are "sticky". This annealing greatly enhances the rate of joining DNA segments by enzymatic ligation and further provides a means for selectively joining ends of segments treated with the same enzyme.
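The annealing rule described above can be sketched in a few lines of Python. This is only an illustration: the helper names `reverse_complement` and `can_anneal` are hypothetical, and the HindIII and EcoRI extensions used as inputs (AGCT and AATT) are the commonly documented ones rather than values taken from this text.

```python
# Minimal sketch: two single-stranded ends can anneal when one is the
# reverse complement of the other (both written 5' to 3').
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def can_anneal(end_a: str, end_b: str) -> bool:
    """True if two single-stranded ends (each 5' to 3') can base-pair."""
    return end_a.upper() == reverse_complement(end_b)


# Ends made by the same enzyme at a symmetrical site are mutually "sticky".
hindiii_end = "AGCT"  # assumed extension left by HindIII (A^AGCTT)
ecori_end = "AATT"    # assumed extension left by EcoRI (G^AATTC)

print(can_anneal(hindiii_end, hindiii_end))  # same enzyme: True
print(can_anneal(hindiii_end, ecori_end))    # different enzymes: False
```

Because the extension left by a symmetrical site is its own reverse complement, any two ends cut by that enzyme can pair, which is the basis of the selective joining described above.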
In the original Okayama-Berg method, after synthesis of the first cDNA strand, an oligo(dG) tail is attached enzymatically to the free end of the plasmid-primed cDNA, and then the plasmid is cleaved by a restriction enzyme (HindIII) to produce a sticky end on the plasmid opposite to the end where the cDNA is attached. A short DNA fragment ("linker"), which contains the SV40 promoter and has a cleaved HindIII site on one end and oligo(dC) on the other, is then attached to the cDNA-plasmid molecule by ligation, to circularize the molecule.
In other, more conventional methods a (synthetic) linker may also be used to clone cDNAs, but it is attached after second strand DNA synthesis and further enzymatic repair which is necessary to form perfectly matched strands (i.e., a "flush" or "blunt" end). To protect internal restriction sites of the double-stranded cDNA from cleavage by the restriction enzyme required to allow ligation of the vector and linker, prior to addition of the linker, the cDNA is methylated with the appropriate DNA modification system associated with the given restriction enzyme. However, such protection may not be absolute; thus, internal sites may be cleaved at some frequency due to an incomplete methylation reaction. In contrast, in the original Okayama-Berg method, this problem of internal cleavage of cDNAs is obviated by cleavage of HindIII sites on the vector when the cDNA is represented as an RNA:DNA hybrid that resists restriction.
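The risk of internal cleavage discussed above can be put in rough numbers. The sketch below locates internal sites in a sequence and estimates how often a 6-base site would occur by chance. It assumes equal base frequencies and independent positions, which real cDNAs do not satisfy exactly; the function names and the toy sequence are hypothetical, and the HindIII site AAGCTT is taken from general knowledge rather than from this text.

```python
def find_sites(seq: str, site: str) -> list[int]:
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    seq, site = seq.upper(), site.upper()
    return [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]


def prob_at_least_one_site(cdna_length: int, site_length: int = 6) -> float:
    """Approximate chance that a random sequence contains a given site.

    Assumption: all four bases are equally likely and every window is
    treated as independent, so this is only an order-of-magnitude guide.
    """
    windows = max(cdna_length - site_length + 1, 0)
    per_window = 0.25 ** site_length
    return 1.0 - (1.0 - per_window) ** windows


HINDIII = "AAGCTT"                     # assumed HindIII recognition sequence
toy_cdna = "GGCAAGCTTACCGTAAGCTTGG"    # hypothetical sequence for illustration

print(find_sites(toy_cdna, HINDIII))             # positions of internal sites
print(round(prob_at_least_one_site(2000), 2))    # chance for a 2000-base cDNA
```

Under these assumptions a 2000-base cDNA has somewhat more than a one-in-three chance of carrying any given 6-base site, which is why longer cDNAs are the ones most exposed to truncation when protection is incomplete.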
The Okayama-Berg approach provides yet another advantage over previous methods in which both ends of separately synthesized cDNAs are ligated to the vector ends at the same time, namely that according to Okayama-Berg, the necessary circularization of the vector DNA with the cDNA attached at one end is relatively efficient via the linker because only one juncture between the cDNA and vector molecules remains to undergo ligation.
Furthermore, the overall Okayama-Berg approach offers additional advantages over previous methods. Following circularization, a process called "RNA nick translation" using DNA polymerase I and RNase H is used which facilitates complete synthesis of the second strand along the entire first strand. This process overcomes the inherently low processivity of DNA polymerase I by using multiple sites for priming of second strand DNA synthesis with RNA primer fragments having random sequences.
Finally, since the Okayama-Berg vector has already been joined to the cDNA when the second strand is synthesized, truncation of cDNA molecules close to the 3' end of the cDNA generally does not occur, in contrast to other methods in which the second strand is completed while the 3' end of the first strand is free and, therefore, more susceptible to damage from nuclease activities.
Cloning vectors based on bacteriophage λ are also known. The second strand synthesis reaction of the Okayama-Berg method has also been utilized in a simpler cloning procedure [Gubler, U. and Hoffman, B. J. (1983) Gene 25, 263-269], allowing cDNA cloning in such λ vectors [Huynh, T. V., Young, R. A. and Davis, R. W. (1985) in DNA Cloning, A Practical Approach, ed. Glover, D. (IRL, Oxford), Vol. I, pp. 49-78]. This λ-based cDNA cloning method has been widely used, mainly due to the high efficiency of transmission of recombinant DNA into cells by means of infectious phage particles, which are produced with in vitro DNA "packaging" systems. λ phage cloning systems also offer convenient clone screening capabilities due to tolerance of a high density of λ plaques on test plates to be screened, compared with most plasmid systems which permit only lower densities of bacterial host colonies.
Early λ systems for cDNA cloning, however, while retaining the second strand synthesis strategy of the original Okayama-Berg plasmid method, lack some of its other advantages. For example, directional cloning is not possible in those original λ systems. In addition, multiple inserts and truncated cDNAs are frequently obtained. Further, despite the high packaging efficiency for native λ DNA molecules, the packaging efficiency of recombinant DNA molecules that are produced by cleavage of intact linear λ molecules and ligation with cDNA fragments is usually low compared to that of intact λ DNA.
Recently, directional cloning capabilities have been introduced into various λ vectors. For example, one such directional λ vector employs a site for insertion of DNA segments that comprises two different restriction enzyme cleavage sites [Meissner, P. S., et al. (1987) Proc. Nat. Acad. Sci. USA 84, 4171-4175]. The cDNA molecules are primed with oligo(dT), made double-stranded, and then methylated with the enzymes needed for protection against internal cleavage by both of the nucleases used in the DNA insertion site of the vector. A linker segment containing a cleavage site for only one of the nucleases of the insertion site is added to both ends of the cDNA. The combination of the last two A:T base pairs on the 3' end of the cDNA with the sequences at one end of the linker, however, creates a cut site for the other of the two nucleases of the insertion site. Thus, after restriction with both nucleases of the insertion site, the individual cDNA segments can ligate into the vector only in a single direction with respect to the two different cleavage sites in the vector.
Various general disadvantages of this particular approach for cDNA cloning in λ phage, compared to the Okayama-Berg plasmid method, have been described above in relation to other systems; and other problems specific to this approach have been noted [Meissner, P. S., et al. (1987), supra]. Nevertheless, it was reported that one cDNA library constructed by this method, starting from 5 µg of mRNA, contained about 2 × 10⁸ clones with 8 of 10 having cDNA inserts (i.e., the reported cloning efficiency was about 3 × 10⁷ recombinants per µg of poly(A)+ RNA).
Directional cloning in other λ phage vectors has also been reported [Palazzolo, M. J. and Meyerowitz, E. M. (1987) Gene 52, 197-206]. [These vectors are known as λSWAJ or λGEM, certain variants of which (LambdaGEM™2 and LambdaGEM™4) are commercially available from Promega Corporation of Madison, Wis. The λGEM type of vectors are also examples of a composite vector comprising both a λ phage genome and an embedded plasmid (GEM)]. The directional cloning scheme in these λ vectors utilizes two different restriction enzyme cleavage sites at the site for insertion of DNA. Thus, for example, to attach the end of a cDNA corresponding to the poly(A) end of the mRNA to a particular end of the cleaved vector DNA that has a sticky end for the restriction enzyme SacI, a synthetic DNA "linker-primer" segment is used which combines a single-stranded oligo(dT) primer with a restriction site for the enzyme SacI. After second strand synthesis, a linker segment with the site of a second restriction enzyme is ligated to the other end of the cDNA, which is then restricted with both enzymes of the insertion site of the vector, according to much the same strategy as described for the previous example of a directional λ phage vector.
This particular approach for directional cloning in a λ vector, however, cannot be used to obtain full-length cDNAs of certain mRNAs because it requires cleavage of the cDNA molecules by the restriction enzyme SacI and a second enzyme (e.g., XbaI) without first protecting the internal sites for these enzymes by appropriate methylation. [In an alternative version of the scheme reported by Palazzolo and Meyerowitz, supra, the XbaI enzyme was replaced by EcoRI and the cDNA was methylated to protect against only this one enzyme.] Sites for these particular enzymes occur frequently by chance in natural nucleotide sequences. Thus, restriction of cDNAs with enzymes like these, as taught in this approach, causes truncation of cDNA inserts with internal SacI (and/or XbaI) sites. In relation to cloning efficiency, it may be noted that this publication described a single cDNA library constructed by this method, starting from 1 µg of poly(A)+ RNA, that contained about 1.6 × 10⁶ clones with cDNA inserts.
In addition to the publications on directional cloning systems described above, there is a report which describes a non-directional plasmid-based system that uses an efficient oligonucleotide-based strategy to promote cDNA insertion into the vector [Aruffo, A. and Seed, B. (1987) Proc. Nat. Acad. Sci. USA 84, 8573-8577]. This method uses synthetic DNA adaptors that encode a recognition site for a particular restriction enzyme, BstXI, which has a variable recognition sequence, as illustrated below: ##STR1## where A, T, G and C indicate nucleotides having the DNA bases adenine, thymine, guanine, and cytosine, respectively (for which the pairs A:T and G:C are complementary), and each N represents a base that is included within the recognition site sequence but that can be any of the usual DNA bases, provided only, of course, that each N and the corresponding N on the opposite DNA strand be complementary. The arrows (↓ and ↑) indicate the cleavage sites on the upper and lower DNA strands, respectively. Accordingly, cleavage of the BstXI site creates a 4-base single-stranded extension (sticky end) on the 3' end that varies from site to site.
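The site-to-site variation of the BstXI extension can be illustrated with a short Python sketch. It assumes the commonly documented form of the site, 5'-CCANNNNN^NTGG-3' on the upper strand, with the lower strand cut four bases to the left. The function name `bstxi_overhangs` and the toy sequence are hypothetical and are not taken from the cited report.

```python
import re

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Assumed BstXI site: CCA, six unspecified bases, TGG (12 bases in all).
# A lookahead is used so that overlapping sites would also be reported.
BSTXI_SITE = re.compile(r"(?=(CCA[ACGT]{6}TGG))")


def bstxi_overhangs(upper_strand: str) -> list[tuple[int, str, str]]:
    """List (start, left_end, right_end) for each BstXI site found.

    left_end is the 4-base 3' extension on the upper strand of the fragment
    to the left of the cut; right_end is the 3' extension on the lower
    strand of the fragment to the right. Both are written 5' to 3'.
    """
    found = []
    for match in BSTXI_SITE.finditer(upper_strand.upper()):
        site = match.group(1)
        left_end = site[4:8]  # bases 5 to 8 of the 12-base site
        right_end = reverse_complement(left_end)
        found.append((match.start(), left_end, right_end))
    return found


# Hypothetical sequence carrying two BstXI sites with different N positions.
toy = "AAAA" + "CCAGCACAGTGG" + "TTTT" + "CCATAGTCCTGG" + "AA"
for start, left_end, right_end in bstxi_overhangs(toy):
    print(start, left_end, right_end)
```

The two sites in the toy sequence give different extensions because the extension is drawn entirely from the unspecified N positions. This is what allows a particular site sequence to be chosen so that its ends do not pair with one another.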
The report above discloses a plasmid vector with a site for insertion of DNA segments in which two identical BstXI sites were placed in inverted orientation with respect to each other and were separated by a short replaceable segment of DNA. Inversion of a DNA sequence consists of representing the base sequence of each strand, conventionally expressed in the 5' to 3' direction of the polynucleotide backbone, in a DNA strand with the same base sequence presented in the 3' to 5' direction (e.g., inversion of the DNA sequence 5'-ACTG-3' produces the DNA sequence 3'-ACTG-5' or, in the conventional 5' to 3' format, 5'-GTCA-3').
With the particular BstXI recognition sequence that was employed in this vector, the 4-base single-stranded ends of the inverted sites created on the two ends of the vector DNA by restriction with the BstXI enzyme were not able to anneal with one another. This situation is illustrated below, where two identical sites, one inverted relative to the other and separated by an unspecified sequence (N . . . N), are shown; the sticky ends of the vector produced by cleavage with the BstXI enzyme are shown in bold print: ##STR2## (Note that the reference does not specify the entire BstXI recognition sequence that was used; only the sequence of the sticky end is clearly defined, as indicated below by inclusion of the N symbol where necessary).
Inspection of these single-stranded end sequences on this plasmid vector reveals that they are identical, due to the inversion of one of the sites relative to the other. Thus, the ends of the vector with inverted and non-inverted copies of this particular BstXI restriction site sequence cannot anneal with each other. Similarly, the restricted ends of the spacer DNA segment between these two sites will be identical. Accordingly, to clone cDNA segments in this vector, a synthetic adaptor was attached to each end at the double-stranded stage, by blunt end ligation, giving them the same termini as the replaceable segment that was removed from the vector with BstXI. The specific adaptor used in the above report comprises the following oligonucleotide sequences:
5'-CTTTAGAGCACA-3'
3'-GAAATCTC-5'.
(i) annealing a linker-primer DNA segment comprising a single-stranded oligonucleotide which has oligo(dT) at the 3' end, and a single-stranded extension at the 5' end that is included in a first non-symmetrical restriction enzyme recognition sequence;
(ii) enzymatically synthesizing the first strand of the cDNA from the linker-primer that is annealed with the mRNA molecule;
(iii) enzymatically synthesizing the second strand of the cDNA using the first strand as the template under conditions such that single-stranded extensions on the synthesized cDNA molecule are made double-stranded;
(iv) ligating onto the blunt-ended cDNA resulting from synthesizing the second strand, an adaptor DNA segment comprising a second non-symmetrical restriction enzyme recognition sequence that is nonidentical to the first non-symmetrical restriction enzyme recognition sequence;
(v) exposing the cDNA resulting from ligation with the adaptor to one or more restriction enzymes that can cleave the first and second non-symmetrical restriction enzyme recognition sequences under conditions such that both of these sequences are cleaved, resulting in the cDNA having two single-stranded ends that are not complementary;
(vi) ligating the cDNA resulting from cleavage with the enzymes to DNA of a genetic cloning vector, where the vector comprises
and where in the vector DNA, at least two non-symmetrical restriction enzyme recognition sequences have been cleaved by one or more enzymes that can cleave those recognition sequences, resulting in vector DNA having two single-stranded ends that are not complementary; wherein further,
(vii) transforming a suitable host cell with the recombinant DNA segment comprising the cDNA and the vector DNA that results from the ligation of cDNA to vector DNA; and
(viii) identifying a clone of host cells, resulting from transformation with said recombinant DNA, that contains a recombinant DNA segment including said cDNA.
Addition of this single adaptor to both ends of the cDNA segments would provide those segments with ends (the single-stranded CACA extensions of the adaptor) that could anneal and subsequently ligate efficiently to both identical vector ends.
Thus, Aruffo and Seed, 1987, supra, discloses a method using this particular BstXI recognition site sequence, whereby neither the cDNA (with attached adaptors) nor the isolated vector DNA (after being freed from the replaceable segment after cleavage with BstXI) was able to ligate to itself. This work, however, neither teaches nor suggests general requirements for a BstXI recognition sequence, or for those of other restriction enzymes, to be usable in this cloning approach.
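The behaviour described above follows from simple base-pairing rules and can be checked with a short Python sketch. The adaptor strands are those quoted above. The vector extension (TGTG) is not stated in the surviving text and is inferred here as the reverse complement of the adaptor extension, so it should be read as an assumption; the helper names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def can_anneal(end_a: str, end_b: str) -> bool:
    """True if two single-stranded ends (each 5' to 3') can base-pair."""
    return end_a == reverse_complement(end_b)


# Adaptor strands as quoted in the text, both rewritten 5' to 3'.
ADAPTOR_UPPER = "CTTTAGAGCACA"
ADAPTOR_LOWER = "CTCTAAAG"  # 3'-GAAATCTC-5' read in the 5' to 3' direction

# The lower strand pairs with the first 8 bases of the upper strand,
# leaving the remaining bases as a single-stranded 3' extension.
assert reverse_complement(ADAPTOR_LOWER) == ADAPTOR_UPPER[:len(ADAPTOR_LOWER)]
adaptor_end = ADAPTOR_UPPER[len(ADAPTOR_LOWER):]

# Assumption: both vector ends carry the extension complementary to the adaptor.
vector_end = reverse_complement(adaptor_end)

print(adaptor_end, vector_end)
print(can_anneal(adaptor_end, adaptor_end))  # cDNA end with cDNA end
print(can_anneal(vector_end, vector_end))    # vector end with vector end
print(can_anneal(adaptor_end, vector_end))   # cDNA end with vector end
```

Only the cDNA-to-vector pairing succeeds, because neither extension is its own reverse complement. The same check shows why the scheme is non-directional: both cDNA ends are identical, so the insert can enter the vector in either orientation.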
Further, as these workers pointed out, their strategy did not provide a directional cloning capability. After first alleging that such directional capability was not needed, they admitted that, nonetheless, they had devoted considerable unsuccessful efforts to developing an alternative means of producing mRNA from every cDNA clone, namely a bidirectional transcription capability whereby both strands of an inserted cDNA would be transcribed. They concluded that this goal cannot be easily attained, at least not in their cloning host system. The authors stated, moreover, that they could obtain cloning efficiencies with their plasmid that were between 0.5 × 10⁶ and 2 × 10⁶ recombinants per µg of mRNA, which were said to compare favorably with those described for certain cloning systems based on phage λ. In the only example of a cDNA library described in this reference, however, the yield of cDNA clones obtained by this method was actually stated to be only ≈3 × 10⁵ recombinants from 0.8 µg poly(A)-containing RNA (i.e., less than 0.4 × 10⁶ recombinants per µg poly(A)-containing RNA).
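The per-microgram efficiencies quoted for the three libraries discussed above can be recomputed from the reported figures. The sketch below uses only numbers stated in the text; the function name and dictionary labels are illustrative, and the insert fraction is taken as 1.0 where the text already counts only clones with inserts.

```python
def recombinants_per_ug(total_clones: float, fraction_with_insert: float,
                        rna_ug: float) -> float:
    """Clones carrying a cDNA insert, per microgram of starting RNA."""
    return total_clones * fraction_with_insert / rna_ug


libraries = {
    # 2 x 10^8 clones, 8 of 10 with inserts, from 5 micrograms of mRNA
    "Meissner et al. (1987), lambda": recombinants_per_ug(2e8, 0.8, 5.0),
    # 1.6 x 10^6 clones with inserts, from 1 microgram of poly(A)+ RNA
    "Palazzolo and Meyerowitz (1987), lambda": recombinants_per_ug(1.6e6, 1.0, 1.0),
    # about 3 x 10^5 recombinants from 0.8 microgram of poly(A)-containing RNA
    "Aruffo and Seed (1987), plasmid": recombinants_per_ug(3e5, 1.0, 0.8),
}

for name, efficiency in libraries.items():
    print(f"{name}: {efficiency:.2e} recombinants per microgram")
```

On these figures the single plasmid library comes out at roughly 3.75 × 10⁵ recombinants per µg, which is below the 0.5 × 10⁶ to 2 × 10⁶ range claimed for the method in general and about two orders of magnitude below the first λ library.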
Thus, there has been a continuing need for methods and vectors which would provide a higher yield of cDNA clones from limited amounts of eukaryotic mRNAs while also providing an improved means of directing orientation of inserted cDNA fragments within vector DNAs.