The present invention relates to recombinant DNA that encodes the PpuMI restriction endonuclease (PpuMI endonuclease or PpuMI) as well as PpuMI methyltransferase (PpuMI methylase or M.PpuMI), and expression of PpuMI endonuclease and methylase in E. coli cells containing the recombinant DNA.
PpuMI endonuclease is found in the bacterium Pseudomonas putida (NEB#372, New England Biolabs, Beverly, Mass.). It recognizes the double-stranded DNA sequence 5′RG/GWCCY3′ (W=A or T, R=A or G, Y=C or T, / indicates the cleavage position) and cleaves between the two guanines to generate 3-base cohesive ends. Due to degeneracy at the central position of the recognition sequence, the cohesive ends derived from two different PpuMI sites may or may not be complementary. PpuMI methylase (M.PpuMI) is also found in the same strain and it recognizes the same DNA sequence as PpuMI endonuclease. M.PpuMI displays homology to the C5-cytosine DNA methyltransferase family. Therefore, M.PpuMI presumably methylates the C5 position of one of the cytosines present within the recognition sequence to protect DNA from PpuMI endonuclease cleavage. The substrate for M.PpuMI may be non-methylated or hemi-methylated DNA.
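The recognition sequence and the remark about cohesive-end compatibility can be illustrated with a minimal Python sketch. It assumes only the site definition given above (RG/GWCCY with a 3-base 5′ overhang); the function names and the test sequence are hypothetical and are not part of the original disclosure.

```python
import re

# IUPAC codes used in the PpuMI site RG/GWCCY, expanded to regex classes.
IUPAC = {"R": "[AG]", "Y": "[CT]", "W": "[AT]", "G": "G", "C": "C"}
SITE = "RGGWCCY"
CUT_OFFSET = 2  # the top strand is cut after "RG"


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_ppumi_sites(seq):
    """Return 0-based start positions of all PpuMI sites (overlaps allowed)."""
    pattern = "(?=" + "".join(IUPAC[base] for base in SITE) + ")"
    return [m.start() for m in re.finditer(pattern, seq.upper())]


def site_overhangs(seq, start):
    """Return the 5' overhangs (left-fragment end, right-fragment end), both 5'->3'."""
    right_end = seq.upper()[start + CUT_OFFSET:start + CUT_OFFSET + 3]  # G-W-C
    left_end = reverse_complement(right_end)
    return left_end, right_end


def ends_compatible(overhang_a, overhang_b):
    """Two 5' overhangs can anneal if one is the reverse complement of the other."""
    return overhang_a == reverse_complement(overhang_b)


# Hypothetical sequence carrying one site with W=A and one site with W=T.
demo = "AAAGGACCTTTTGGGTCCCAAA"
sites = find_ppumi_sites(demo)
left1, right1 = site_overhangs(demo, sites[0])
left2, right2 = site_overhangs(demo, sites[1])
```

In this sketch, the left-fragment end of one site and the right-fragment end of another anneal only when both sites carry the same central base, which is one way to read the statement that ends from different PpuMI sites may or may not be complementary.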
Type II restriction endonucleases are a class of enzymes that occur naturally in bacteria and in some viruses. When they are purified away from other bacterial/viral proteins, restriction endonucleases can be used in the laboratory to cleave DNA molecules into small fragments for molecular cloning and gene characterization.
Restriction endonucleases recognize and bind particular sequences of nucleotides (the ‘recognition sequence’) along DNA molecules. Once bound, they cleave the molecule within (e.g. BamHI), to one side of (e.g. SapI), or to both sides (e.g. TspRI) of the recognition sequence. Different restriction endonucleases have affinity for different recognition sequences. Over two hundred and twenty-eight restriction endonucleases with unique specificities have been identified among the many hundreds of bacterial species that have been examined to date (Roberts and Macelis, Nucl. Acids Res. 29:268-269 (2001)).
Restriction endonucleases typically are named according to the bacteria in which they are discovered. Thus, the species Deinococcus radiophilus, for example, produces three different restriction endonucleases, named DraI, DraII and DraIII. These enzymes recognize and cleave the sequences 5′TTT/AAA3′, 5′RG/GNCCY3′ and 5′CACNNN/GTG3′ respectively. Escherichia coli RY13, on the other hand, produces only one enzyme, EcoRI, which recognizes the sequence 5′G/AATTC3′.
It is thought that in nature, restriction endonucleases play a protective role in the welfare of the bacterial cells. The enzymes cleave invading foreign DNA molecules such as plasmids or viral DNA that would otherwise destroy or parasitize the bacteria while the host bacterial DNA remains intact. The cleavage that takes place disables many of the infecting genes and renders the DNA susceptible to further degradation by non-specific nucleases.
A second component of the bacterial protective system is the modification methylase, which protects host DNA from cleavage by the restriction endonuclease with which it coexists. The restriction endonuclease and modification methylase form the restriction-modification (R-M) system. The methylase provides the means by which bacteria are able to protect their own DNA and distinguish it from foreign DNA. Modification methylases recognize and bind to the same recognition sequence as the corresponding restriction endonuclease, but instead of cleaving the DNA, they chemically modify one particular nucleotide within the sequence by the addition of a methyl group to produce C5 methyl cytosine, N4 methyl cytosine, or N6 methyl adenine. Following methylation, the recognition sequence is no longer cleaved by the cognate restriction endonuclease. The DNA of a bacterial cell is always fully modified by the activity of its modification methylase. It is therefore completely insensitive to the presence of the endogenous restriction endonuclease. Only unmodified, and therefore identifiably foreign, DNA is susceptible to restriction endonuclease recognition and cleavage. During and after DNA replication, hemi-methylated DNA (DNA methylated on one strand) is usually also resistant to the cognate restriction endonuclease.
With the advancement of recombinant DNA technology, it is now possible to clone restriction-modification genes and overproduce the enzymes in large quantities. The key to isolating clones of restriction-modification genes is to develop an efficient method to identify such clones within genomic DNA libraries (i.e. populations of clones derived by ‘shotgun’ procedures) when they occur at frequencies as low as 10⁻³ to 10⁻⁴. Preferably, the method should be selective, such that the unwanted clones with non-methylase inserts are destroyed while the desirable rare clones survive.
A large number of type II restriction-modification systems have been cloned. The first cloning method used bacteriophage infection as a means of identifying or selecting restriction endonuclease clones (EcoRII: Kosykh et al., Mol. Gen. Genet. 178:717-719 (1980); HhaII: Mann et al., Gene 3:97-112 (1978); PstI: Walder et al., Proc. Nat. Acad. Sci. 78:1503-1507 (1981)). Since the expression of restriction-modification systems in bacteria enables them to resist infection by bacteriophages, cells that carry cloned restriction-modification genes can, in principle, be selectively isolated as survivors from genomic DNA libraries that have been exposed to phage. However, this method has been found to have only a limited success rate. Specifically, it has been found that cloned restriction-modification genes do not always confer sufficient phage resistance to achieve selective survival.
Another cloning approach involves transferring systems initially characterized as plasmid-borne into E. coli cloning vectors (EcoRV: Bougueleret et al., Nucl. Acids. Res. 12:3659-3676 (1984); PaeR7: Gingeras and Brooks, Proc. Natl. Acad. Sci. USA 80:402-406 (1983); Theriault and Roy, Gene 19:355-359 (1982); PvuII: Blumenthal et al., J. Bacteriol. 164:501-509 (1985); Tsp45I: Wayne et al. Gene 202:83-88 (1997)).
A third approach is to select for active expression of methylase genes (methylase selection) (U.S. Pat. No. 5,200,333 and BsuRI: Kiss et al., Nucl. Acids. Res. 13:6403-6421 (1985)). Since restriction-modification genes are often closely linked together, both genes can often be cloned simultaneously. This selection does not always yield a complete restriction system however, but instead yields only the methylase gene (BspRI: Szomolanyi et al., Gene 10:219-225 (1980); BcnI: Janulaitis et al., Gene 20:197-204 (1982); BsuRI: Kiss and Baldauf, Gene 21:111-119 (1983); and PstI: Walder et al., J. Biol. Chem. 258:1235-1241 (1983)).
A more recent method, the “endo-blue method”, has been described for direct cloning of thermostable restriction endonuclease genes into E. coli based on the indicator strain of E. coli containing the dinD::lacZ fusion (Fomenkov et al., U.S. Pat. No. 5,498,535, (1996); Fomenkov et al., Nucl. Acids Res. 22:2399-2403 (1994)). This method utilizes the E. coli SOS response signal following DNA damage caused by restriction endonucleases or non-specific nucleases. A number of thermostable nuclease genes (TaqI, Tth111I, BsoBI, Tf nuclease) have been cloned by this method (U.S. Pat. No. 5,498,535). The disadvantage of this method is that some positive blue clones containing a restriction endonuclease gene are difficult to culture due to the lack of the cognate methylase gene.
There are three major groups of methyltransferases: the C5-cytosine methylases and the two groups of amino-transferases, namely the N4-cytosine methylases and the N6-adenine methylases (Malone et al. J. Mol. Biol. 253:618-632 (1995)). These groups of methylases derive their names from the position and the base that is modified. When a restriction site on DNA is modified (methylated) by the methylase, it is resistant to digestion by the cognate restriction endonuclease. Sometimes methylation by a non-cognate methylase can also render DNA sites resistant to restriction digestion. For example, Dcm methylase modification of 5′CCWGG3′ (W=A or T) can also make the DNA resistant to PspGI restriction digestion. Another example is that CpG methylase can modify the CG dinucleotide of the NotI site (5′GCGGCCGC3′) and make it refractory to NotI digestion (New England Biolabs' Catalog, page 220 (2000-2001)). Therefore methylases can be used as a tool to modify certain DNA sequences and make them uncleavable by restriction enzymes.
Type II methylase genes have been found in many sequenced bacterial genomes (GenBank, http://www.ncbi.nlm.nih.gov; and Rebase™, http://rebase.neb.com/rebase). Direct cloning and over-expression of ORFs adjacent to methylase genes yielded restriction enzymes with novel specificities (Kong et al. Nucl. Acids Res. 28:3216-3223 (2000)). Thus microbial genome mining emerged as a new way of screening/cloning new type II restriction enzymes and methylases and their isoschizomers.
Because purified restriction endonucleases and modification methylases are useful tools for creating recombinant DNA molecules in the laboratory, there is a strong commercial interest in obtaining, through recombinant DNA techniques, bacterial strains that produce large quantities of restriction enzymes and methylases. Such over-expression strains should also simplify the task of enzyme purification.
The present invention relates to a method for cloning the PpuMI restriction endonuclease gene (ppuMIR) and the PpuMI methylase gene (ppuMIM) from Pseudomonas putida into E. coli. The ppuMIR gene was cloned by inverse PCR and direct PCR from genomic DNA using oligonucleotide primers that were based on the DNA sequences obtained via methylase selection.
The initial difficulty was to select the functional PpuMI methylase gene from a plasmid library. The first plasmid library was generated by ligation of Pseudomonas putida genomic DNA fragments into pBR322 and transformation into E. coli. Plasmid pBR322 contains two PpuMI sites downstream of the tetracycline resistance gene (Tet). However, one site is blocked by dcm methylation so, effectively, only one site is useful for the methylase selection procedure. Primary library DNA was incubated with PpuMI endonuclease to select for undigested plasmids containing methylated PpuMI sites. When the challenged DNA was transformed into E. coli, a small number of colonies contained pBR322 clones. However, none of the plasmid isolates from these colonies contained the ppuMIM gene. Failure to select the ppuMIM gene could have been due to many factors. The most probable reason for failure of the methylase selection procedure is inadequate expression of the methylase gene in E. coli. To address this potential problem, plasmid libraries were created in a high-copy derivative of pUC18, designated pJS105-22. Plasmid pJS105-22 contains the chloramphenicol resistance gene (Cam) in place of the ampicillin resistance gene (Amp). Most importantly, pJS105-22 contains three PpuMI sites, which reduces the number of false positives attributed to incomplete digestion during the challenge step of the methylase selection. By transforming E. coli with a library of pJS105-22 plasmids, the ppuMIM gene was isolated in twenty-two of thirty-six clones resulting from the methylase selection procedure. The insert DNA of clone 2A was confirmed to contain a gene homologous to the C5-cytosine methyltransferase family. This gene was presumed to be the ppuMIM gene since it displayed homology to the Eco109I methylase, which modifies the nearly identical sequence RG/GNCCY.
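The reasoning behind using a vector with three PpuMI sites can be sketched with a toy probability model in Python. The model is an assumption made for illustration, not something stated in the text: each unmethylated site is taken to escape digestion independently with the same probability, and the value 0.1 is purely hypothetical.

```python
def unmethylated_survival(p_escape, n_sites):
    """Fraction of unmethylated plasmids expected to survive the challenge uncut.

    Toy model (assumption): every site escapes cleavage independently with
    probability p_escape, so a plasmid survives only if all n_sites escape.
    """
    if not 0.0 <= p_escape <= 1.0:
        raise ValueError("p_escape must lie between 0 and 1")
    if n_sites < 0:
        raise ValueError("n_sites must be non-negative")
    return p_escape ** n_sites


P_ESCAPE = 0.1  # hypothetical per-site escape probability, not a measured value

# One effective site (as described for pBR322) versus three sites (pJS105-22).
for n_sites in (1, 3):
    print(n_sites, unmethylated_survival(P_ESCAPE, n_sites))
```

Under these assumptions the background of uncut, unmethylated plasmids falls steeply as sites are added, whereas a plasmid carrying a functional methylase gene is protected at every site.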
The ppuMIR gene was identified by sequencing the genes adjacent to the ppuMIM gene. Inverse PCR walking identified an open reading frame downstream of ppuMIM. This 975 bp ORF starting with the first ATG (92 bp downstream from the ppuMIM stop codon) was PCR-amplified from genomic DNA and cloned into expression vector pJS12T, which was created by modification of pR976 (NEB collection, New England Biolabs; Beverly, Mass.). However, PpuMI restriction activity was not detected in the E. coli cell extract of fourteen recombinant clones. In addition, this downstream ORF did not display homology to the eco109I restriction endonuclease gene as might be expected. Consequently, the region upstream of the ppuMIM gene was sequenced to possibly identify an ORF encoding the ppuMIR gene. Located approximately 650 bp upstream of ppuMIM, a significant ORF was discovered, but the hypothetical protein sequence displayed similarity to transposase proteins of various bacteria. Therefore, the downstream ORF was re-evaluated. Upon re-evaluation, an in-frame GTG codon was found only 17 bp downstream from the ppuMIM stop codon. The GTG sequence codes for valine and in some cases can be used for initiation of translation. The downstream region was again PCR-amplified from genomic DNA to give an ORF of 1050 bp. In this case, the 5′ forward primer contained an NdeI site that created a GTG to ATG mutation and allowed cloning into the NdeI site of expression vector pJS12T. Of fourteen clones analyzed for activity, ten displayed restriction activity identical to native PpuMI endonuclease. The recombinant PpuMI R-M system [pJS12T-PpuMIR, pSYX20-PpuMIM] within E. coli host ER2502 provides 2.4×10⁵ units per gram of wet cells.
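The ORF lengths quoted above can be cross-checked with a short Python sketch. It assumes that both offsets are measured the same way from the ppuMIM stop codon. The helper to_ndei_start and the sequence passed to it are hypothetical and only illustrate how an NdeI site (CATATG) places an ATG where the native GTG start was; they do not reproduce the actual primer.

```python
ATG_OFFSET = 92     # bp from the ppuMIM stop codon to the first ATG (from the text)
GTG_OFFSET = 17     # bp from the ppuMIM stop codon to the in-frame GTG (from the text)
ORF_FROM_ATG = 975  # bp, ORF length when starting at the first ATG (from the text)

# Moving the start upstream to the GTG extends the ORF by the offset difference.
EXTENSION = ATG_OFFSET - GTG_OFFSET
assert EXTENSION % 3 == 0, "the GTG must be in frame with the ATG-initiated ORF"
ORF_FROM_GTG = ORF_FROM_ATG + EXTENSION


def to_ndei_start(orf_seq):
    """Replace a leading GTG with ATG and prefix CAT, giving an NdeI site CATATG."""
    if not orf_seq.upper().startswith("GTG"):
        raise ValueError("expected the ORF to begin with GTG")
    return "CAT" + "ATG" + orf_seq[3:]


print(EXTENSION, ORF_FROM_GTG)
print(to_ndei_start("GTGAAACCC"))  # hypothetical sequence, for illustration only
```

The 75 bp extension is a whole number of codons, consistent with the GTG being in frame, and adding it to 975 bp gives the 1050 bp ORF reported in the text.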