The invention concerns a method for the enrichment and isolation of genes of interest from recombinant DNA libraries.
Recombinant DNA (including cDNA and genomic) libraries consist of a large number of recombinant DNA clones, each containing a different segment of foreign DNA. In order to ensure that a recombinant cDNA library contains at least one copy of each mRNA in the cell, it generally needs to include between about 500,000 and 1,000,000 independent cDNA clones. Current Protocols in Molecular Biology, Ausubel et al., editors, Greene Publishing Associates and Wiley-Interscience, N.Y., 1991, vol. 1, Unit 5.8.1. Similarly, a genomic library with a base of about 700,000 clones is required to obtain a complete library of mammalian DNA. Ausubel et al., supra, Unit 5.7.1. While the frequency of different genes in any particular library varies, most genes will be present at a frequency of about 1 part in 10^3 to 10^6. A particularly rare mRNA will be represented by a single clone out of 10^6 clones, while the majority of the genes will be present at a frequency of 1 in 10^4 to 10^5 clones.
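The library sizes quoted above follow from simple sampling arithmetic. As a rough sketch, assuming clones are drawn independently and that a given sequence occurs at frequency f, the chance that a library of N clones contains it at least once is 1 - (1 - f)^N. The function names below are illustrative only and do not come from the cited protocols.

```python
import math


def prob_represented(n_clones: int, frequency: float) -> float:
    """Probability that at least one of n_clones carries a sequence
    present at the given frequency (assumes independent sampling)."""
    return 1.0 - (1.0 - frequency) ** n_clones


def clones_needed(frequency: float, confidence: float) -> int:
    """Smallest library size N such that a sequence at the given
    frequency is represented with at least the given confidence."""
    if not (0.0 < frequency < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("frequency and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - frequency))


if __name__ == "__main__":
    # A rare clone at 1 in 10^6, sampled in a library of 10^6 clones.
    print(prob_represented(1_000_000, 1e-6))
    # Library size needed to capture a 1-in-10^5 clone with 99% confidence.
    print(clones_needed(1e-5, 0.99))
```

Under this model, a clone present at 1 in 10^6 has only about a 63% chance of appearing in a library of 10^6 clones, which illustrates why rare sequences call for large libraries.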
The identification and isolation of any desired recombinant DNA clone from among such a daunting number of total clones is not an easy task. Over the past 25 years, several cloning methods have been developed. In most cases, desired clones are identified by screening DNA libraries with nucleic acid probes, ligands or antibodies. Usually, libraries are introduced into host cells and plated out, and the colonies are transferred to nitrocellulose filters and hybridized to 32P-labeled probes or bound to antibodies. Such filter hybridization methods (see Sambrook et al., 1989, infra) do not involve an enrichment step. In order to clone a particular gene, sometimes as many as one million clones must be screened. Subtractive hybridization techniques have also been used to isolate target DNA. In this technique, the cDNA molecules created from a first population of cells are hybridized to cDNA or RNA of a second population of cells in order to “subtract out” those cDNA molecules that are complementary to nucleic acid molecules present in the second population, that is, those that reflect nucleic acid molecules present in both populations, thereby leaving only molecules unique to the population of interest.
Inverse polymerase chain reaction (IPCR) has been described (see Ochman et al., Genetic applications of an inverse PCR reaction, Genetics 120: 621 (1988)). In IPCR, the primers are oriented in the reverse direction of the usual orientation in conventional PCR, i.e., the two primers extend away from each other. Inverse PCR was originally used to amplify uncharacterized sequences immediately flanking transposable elements. Inverse PCR has been used to isolate a target gene; however, there is no selection or enrichment for the target gene in conventional inverse PCR protocols, resulting in a high background of colonies.
Li et al., in U.S. Pat. Nos. 5,500,356 and 5,789,166, describe a method of isolating a desired target nucleic acid from a nucleic acid library that involves the use of biotinylated probes and enzymatic repair-cleavage to eliminate the parental template nucleic acid of the library. This method requires a single-stranded nucleic acid library (M13 phagemid library). If the library consists of double-stranded plasmids, single strands have to be prepared initially. A biotinylated oligonucleotide probe is hybridized to a target sequence within the single-stranded molecules. This hybridized complex is then captured on avidin-coated beads, and the library is recovered from the beads by denaturation of the hybridized molecules. This selection eliminates undesired single-stranded phagemid DNA. This method of cloning does not involve amplification of the target gene before the selection step. The recovered single-stranded DNA is converted to ds DNA in the presence of dNTPs (but not dUTP), and the mixture is then digested with the enzyme HhaI, which digests away residual ss DNA that contains dUTP. Transformation and isolation of the desired molecule follow.
PCR based site-directed mutagenesis has been used to create a desired mutation such as a point mutation, deletion or insertion. In the site-directed mutagenesis method of Bauer et al., U.S. Pat. No. 5,789,166, the starting DNA template for the PCR amplification is typically a homogeneous population of plasmids all containing the one insert of interest that is to be mutated. Both oligonucleotide primers for such a PCR reaction are mutagenic primers which must contain the desired mutation; for point mutations, these primers are designed to contain at least one mismatched base relative to the template which upon primer extension will result in the desired mutation of the target gene. For the point mutations, the primers are overlapping, i.e., they need to anneal to the same sequence on opposite strands of the plasmid. For deletion mutagenesis, the primers are designed such that there is a gap between the 5′ ends of the primer pair. Thus, the product of the primer extension has a gap in the sequence of the target gene corresponding to the sequence to be deleted. Mutated plasmids containing the desired mutation are selected for and transformed into competent bacteria.
From the above discussion, it is apparent that there is a need for a cloning method that is versatile, easier to perform and less laborious, that can provide higher throughput, and is economical. The present invention overcomes the limitations of conventional cloning methods and provides additional advantages that will be apparent from the detailed description below.
The invention provides a method of amplifying and isolating a nucleic acid molecule of interest from a mixture of nucleic acid molecules, comprising:
(i) providing a recombinant nucleic acid library with a heterogeneous population of methylated, circular nucleic acid molecules as template molecules;
(ii) annealing a first and a second primer to complementary strands of the circular nucleic acid molecule, to produce an annealed mixture, wherein
the two primers, in the 5′ to 3′ direction, extend in opposite directions relative to each other during polymerase chain reaction;
the 5′ ends of the two primers are adjacent to each other; and
wherein each primer is identical in sequence to its corresponding sequence in the nucleic acid molecule of interest;
(iii) subjecting the annealed mixture to polymerase chain reaction, thereby producing an amplified mixture containing linear amplicons;
(iv) digesting the amplified mixture with an enzyme that selectively cleaves methylated DNA, thereby eliminating the template molecules and enriching the nucleic acid molecule of interest; and
(v) isolating the nucleic acid molecule of interest.
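The primer geometry recited in step (ii) can be illustrated with a short sketch. It assumes that a stretch of the target sequence is already known and that a junction position within it has been chosen; the function names, the default primer length of 20 nucleotides, and the example sequence are hypothetical and are not taken from the specification. The forward primer copies the top strand starting at the junction, and the reverse primer is the reverse complement of the bases immediately upstream of the junction, so the two 5′ ends abut with no gap or overlap and the primers extend away from each other without any mismatch to the target.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_inverse_primers(known_seq: str, junction: int, length: int = 20) -> Tuple[str, str]:
    """Design a back-to-back primer pair for inverse PCR (illustrative only).

    known_seq: known stretch of the target, top strand written 5' to 3'.
    junction:  index in known_seq where the two primers' 5' ends meet.
    length:    primer length in nucleotides (hypothetical default).
    """
    if junction - length < 0 or junction + length > len(known_seq):
        raise ValueError("known sequence too short for primers of this length at this junction")
    # Forward primer: identical to the top strand, extends downstream of the junction.
    forward = known_seq[junction:junction + length]
    # Reverse primer: identical to the bottom strand, extends upstream of the junction.
    reverse = reverse_complement(known_seq[junction - length:junction])
    return forward, reverse


if __name__ == "__main__":
    fwd, rev = design_inverse_primers("ATGCATGCAAGGCCTT", junction=8, length=4)
    print("forward 5'-" + fwd + "-3'")
    print("reverse 5'-" + rev + "-3'")
```

Because the primers point outward on a circular template, extension proceeds around the entire plasmid and yields a linear amplicon whose ends meet at the chosen junction.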
In one embodiment, the first and second primers are provided phosphorylated on their 5′ ends. In this embodiment, the method will comprise a ligation step after the PCR step but prior to the digesting step, to ligate the linear amplicons to produce circular replicons. Alternatively, after the digestion step, amplicons larger than the size of the cloning vector of the nucleic acid library are isolated, such as by gel purification, and the isolated amplicons are ligated.
In a preferred embodiment, the enzyme used to digest the amplified mixture is a restriction endonuclease. In a specific embodiment, the restriction endonuclease is Dpn I.
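The selection performed by Dpn I can be modeled in a few lines. Dpn I recognizes the sequence GATC and cleaves it only when the adenine is methylated, so library templates propagated in a methylating host are cut while amplicons synthesized in vitro are not. The sketch below is a deliberate simplification: it treats each molecule as either fully methylated or unmethylated, ignores hemimethylated DNA, scans the sequence as written without wrapping around a circular molecule, and uses hypothetical class and function names.

```python
import re
from dataclasses import dataclass
from typing import List

DPNI_SITE = "GATC"  # Dpn I recognition sequence; cleavage requires methylated adenine


@dataclass
class Molecule:
    name: str
    sequence: str
    methylated: bool  # simplification: fully methylated or not at all


def dpni_sites(sequence: str) -> List[int]:
    """0-based start positions of every GATC in the sequence as written."""
    return [m.start() for m in re.finditer(DPNI_SITE, sequence.upper())]


def survives_dpni(molecule: Molecule) -> bool:
    """A molecule is cut only if it is methylated and carries at least one site."""
    return not (molecule.methylated and len(dpni_sites(molecule.sequence)) > 0)


def digest(mixture: List[Molecule]) -> List[Molecule]:
    """Return the molecules left intact after the modeled digestion."""
    return [m for m in mixture if survives_dpni(m)]


if __name__ == "__main__":
    mixture = [
        Molecule("library template", "AAGATCTTGATC", methylated=True),
        Molecule("PCR amplicon", "AAGATCTTGATC", methylated=False),
    ]
    for m in digest(mixture):
        print(m.name)
```

In this model only the unmethylated amplicon remains after digestion, which is the enrichment effect described in step (iv).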
Multiple recombinant nucleic acid libraries can be mixed into a single reaction mix for polymerase chain reaction. In addition, multiple nucleic acid molecules of interest can be cloned simultaneously by applying aliquots of a solution of the mixed libraries to wells of a 96-well microtiter plate and performing the polymerase chain reaction on the microtiter plate.
In one embodiment, the nucleic acid library is a double-stranded DNA library wherein the DNA is methylated. Preferred DNA libraries are human cDNA or human genomic DNA libraries. Preferably, the nucleic acid molecule of interest is represented in the library at a frequency of greater than 5×10^5, even more preferably, at a frequency of equal to or greater than 1×10^6.
The method of the above embodiments further comprises the step of transforming the circular replicons into a suitable host cell after the digesting step, to generate clones. In one embodiment, the host cell is a competent bacterial host cell. The clones are then screened to identify the clone containing the nucleic acid molecule of interest. In an alternative embodiment, the screening step is omitted and the clones are directly sequenced. Sequencing is performed on nucleic acid isolated from the clones. Nucleic acid from multiple clones can be pooled for the sequencing step.
The invention provides a method of amplifying and isolating a nucleic acid molecule of interest from a recombinant nucleic acid library, comprising:
(i) providing a recombinant nucleic acid library with a heterogeneous population of methylated, circular nucleic acid molecules as template molecules;
(ii) annealing a first and a second primer to complementary strands of the circular nucleic acid molecule, to produce an annealed mixture, wherein
the two primers extend in opposite directions relative to each other during polymerase chain reaction;
the 5′ ends of the two primers are phosphorylated and adjacent to each other; and
wherein each primer is identical in sequence to its corresponding sequence in the nucleic acid molecule of interest;
(iii) subjecting the annealed mixture to polymerase chain reaction, thereby producing an amplified mixture containing amplicons;
(iv) ligating the amplicons in a ligation mix to produce circular replicons;
(v) subjecting the ligation mix after ligation to digestion with an enzyme that selectively cleaves methylated nucleic acid, thereby eliminating the template molecules and enriching the nucleic acid molecule of interest;
(vi) transforming the replicons into a suitable host cell to generate transformed clones; and
(vii) isolating the nucleic acid molecule of interest.
In one embodiment of the preceding method, a screening step is provided to screen the transformed clones to identify the clone containing the nucleic acid molecule of interest.
In any of the above embodiments, greater than 50 cDNA libraries can be provided in a single mix for polymerase chain reaction.