The present invention relates to recombinant DNA which encodes the NcoI restriction endonuclease and modification methylase, and to methods for the production of these enzymes from the recombinant DNA.
Many bacteria contain systems which guard against invasion of foreign DNA. Bacterial cells contain specific endonucleases that make double-strand scissions in invading DNA unless the DNA has been previously modified, usually by the corresponding DNA methylase. The endonuclease with its accompanying methylase is called a restriction-modification system (hereinafter "R-M system"). The principal function of R-M systems is thus defensive: they enable bacterial cells to resist infections by bacteriophage and plasmid DNA molecules which might otherwise parasitize them.
Three distinct types of R-M systems have been characterized on the basis of their subunit compositions, co-factor requirements, and type of DNA cleavage. Type I R-M systems are the most complex. The endonuclease typically contains three different types of subunits and requires Mg++, ATP, and S-adenosyl-methionine for DNA cleavage. Their recognition sites are complex, and DNA cleavage occurs at non-specific sites anywhere from 400 to 7000 base pairs from the recognition site.
Type III R-M systems are somewhat less complex. The endonuclease of type III R-M systems contains only two types of subunits, and although Mg++ and ATP are required for DNA cleavage, S-adenosyl-methionine stimulates enzymatic activity without being an absolute requirement. DNA cleavage occurs distal to the recognition site by about 25-27 base pairs.
Type II R-M systems are much simpler than either type I or type III. The endonuclease contains only one subunit, and only Mg++ is required for DNA cleavage. Moreover, the DNA cleavage site occurs within or adjacent to the enzyme's recognition site. It is this class of restriction endonucleases that has proved most useful to molecular biologists.
Bacteria usually possess only a small number of restriction endonucleases per species. The endonucleases are named according to the bacteria from which they are derived. Thus, the species Haemophilus aegyptius, for example, synthesizes three different restriction endonucleases, named Hae I, Hae II and Hae III. These enzymes recognize and cleave the sequences (AT)GGCC(AT), PuGCGCPy and GGCC respectively. Escherichia coli RY13, on the other hand, synthesizes only one enzyme, EcoR I, which recognizes the sequence GAATTC.
Restriction endonucleases, the first component of R-M systems, have been characterized primarily with respect to their recognition sequence and cleavage specificity because of their practical use for molecular dissection of DNA. The majority of restriction endonucleases recognize sequences 4-6 nucleotides in length. More recently, restriction endonucleases having recognition sequences of 7-8 nucleotides in length have been found. Most, but not all, recognition sites contain a dyad axis of symmetry, and in most cases, all the bases within the site are uniquely specified. Recognition sequences with this symmetry have been termed "palindromes." Some restriction endonucleases have degenerate or relaxed specificities in that they can recognize multiple bases at the same positions. EcoRI, which recognizes the sequence GAATTC, is an example of a restriction endonuclease having a symmetrical recognition sequence, while HaeII, which recognizes the sequence PuGCGCPy, typifies restriction endonucleases having a degenerate or relaxed specificity. Endonucleases with symmetrical recognition sites generally cleave symmetrically within or adjacent to the recognition site, while those that recognize asymmetric sites tend to cut at a distance from the recognition site, typically from about 1-13 base pairs away from that site.
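The notions of a palindromic site and a degenerate site can be illustrated with a short Python sketch. It is offered only as an illustration and is not part of the described invention: the function names, the example DNA strings, and the use of the single-letter symbols R (for Pu), Y (for Py) and W (for A or T) are assumptions made here. A site is treated as palindromic when it equals its own reverse complement, and a degenerate site is matched by expanding each symbol into the set of bases it permits.

```python
import re

# Watson-Crick complements for the four unambiguous bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Assumed one-letter shorthand: R = purine (Pu), Y = pyrimidine (Py), W = A or T.
DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]",
}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(site):
    """A site has dyad symmetry when it reads the same on both strands."""
    return site == reverse_complement(site)


def find_sites(site, dna):
    """Return 0-based start positions of every (possibly overlapping) match."""
    body = "".join(DEGENERATE[symbol] for symbol in site)
    # The lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(" + body + "))")
    return [match.start() for match in pattern.finditer(dna)]


if __name__ == "__main__":
    print(is_palindromic("GAATTC"))                      # EcoR I site
    print(find_sites("RGCGCY", "TTAGCGCTAAGGCGCCTT"))    # Hae II-style site
```

In this sketch the Hae II-style pattern RGCGCY matches both AGCGCT and GGCGCC, which is what is meant above by recognizing multiple bases at the same position.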
The second component of bacterial R-M systems is the modification methylase. These enzymes are complementary to restriction endonucleases and provide the means by which bacteria are able to protect their own DNA and distinguish it from foreign, infecting DNA. Modification methylases recognize and bind to the same nucleotide recognition sequence as the corresponding restriction endonuclease, but instead of breaking the DNA, they chemically modify one or more of the nucleotides within the sequence by the addition of a methyl group. Following methylation, the recognition sequence is no longer bound or cleaved by the corresponding restriction endonuclease. The DNA of a bacterial cell is always modified by virtue of the activity of its modification methylase, and it is therefore insensitive to the presence of the endogenous restriction endonuclease. It is only unmodified, and therefore identifiably foreign, DNA that is sensitive to restriction endonuclease recognition and attack.
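The protective relationship between methylase and endonuclease described above can be sketched as a toy model in Python. This is a simplified illustration under stated assumptions, not a description of any actual enzyme kinetics: the function names and sequences are hypothetical, methylation is recorded simply as the set of site positions that have been modified, and every occurrence of the site is assumed to be fully methylated in the host.

```python
def find_sites(site, dna):
    """Return 0-based start positions of every occurrence of site in dna."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def methylate(site, dna):
    """Toy methylase: mark every occurrence of the recognition site as modified."""
    return set(find_sites(site, dna))


def cleavable_sites(site, dna, methylated):
    """Toy endonuclease: only sites that are NOT methylated can be cleaved."""
    return [pos for pos in find_sites(site, dna) if pos not in methylated]


if __name__ == "__main__":
    site = "GAATTC"                      # EcoR I recognition sequence
    host_dna = "TTGAATTCAAGAATTCTT"      # hypothetical host sequence
    phage_dna = "TTGAATTCAAGAATTCTT"     # same sequence, but unmodified

    host_marks = methylate(site, host_dna)
    print(cleavable_sites(site, host_dna, host_marks))   # host DNA is protected
    print(cleavable_sites(site, phage_dna, set()))       # foreign DNA is attacked
```

The two calls use identical sequences, so the only thing distinguishing self from foreign DNA in the model is the methylation state, which mirrors the point made in the paragraph above.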
More than 1000 different restriction endonucleases have been isolated from bacterial strains, and many share common specificities. Restriction endonucleases which recognize identical sequences are called "isoschizomers." Although the recognition sequences of isoschizomers are the same, they may vary with respect to site of cleavage (e.g., Xma I v. Sma I: Endow et al., J. Mol. Biol. 112:521 (1977); Waalwijk et al., Nucleic Acids Res. 5:3231 (1978)) and in cleavage rate at various sites (Xho I v. Pae R7I: Gingeras et al., Proc. Natl. Acad. Sci. U.S.A. 80:402 (1983)).
With the advent of genetic engineering technology, it is now possible to clone genes and to produce the proteins and enzymes that they encode in greater quantities than are obtainable from their natural sources by conventional purification techniques.
Type II restriction-modification systems are being cloned with increasing frequency. Four methods are being used to clone R-M systems into E. coli: (1) sub-cloning of natural plasmids; (2) selection based on phage restriction; (3) selection based on vector modification; and (4) multi-step isolation.
The first cloned systems used bacteriophage infection as a means of identifying or selecting restriction endonuclease clones (Hha II: Mann, et al., Gene 3: 97-112, (1978); EcoR II: Kosykh, et al., Molec. Gen. Genet. 178: 717-719, (1980); Pst I: Walder, et al., Proc. Nat. Acad. Sci. USA 78: 1503-1507, (1981)). Since the presence of R-M systems in bacteria enables them to resist infection by bacteriophages, cells that carry cloned R-M genes can, in principle, be selectively isolated as survivors from libraries that have been exposed to phage. This method has been found, however, to have only limited value. Specifically, it has been found that cloned R-M genes do not always manifest sufficient phage resistance to confer selective survival.
Subcloning of natural plasmids involves transferring systems initially characterized as plasmid-borne into E. coli cloning plasmids (EcoRV: Bougueleret, et al., Nucleic Acids Res. 12: 3659-3676, (1984); PaeR7: Gingeras and Brooks, Proc. Natl. Acad. Sci. USA 80: 402-406, (1983); Theriault and Roy, Gene 19: 355-359, (1982); Pvu II: Blumenthal, et al., J. Bacteriol. 164: 501-509, (1985)). In this approach the plasmids are purified prior to digestion and ligation, thereby reducing the complexity of the source DNA. Isolating the system then involves sub-cloning and characterizing libraries and performing selections. This approach also has a number of limitations, chief among them that most R-M systems are located on the bacterial chromosome, not on plasmids.
Vector modification, the most successful approach to date, is predicated on the assumption that the restriction and modification genes of a particular type II system are linked and are expressed sequentially, methylase and then endonuclease. Thus, in a population of methylase positive clones, some clones should also carry the corresponding endonuclease gene. This approach, known as methylase selection, was first used successfully by Wilson, EPO Publication No. 0193413, to clone the Hae II, Taq I, Ban I, Hind III, Hinf I, and Msp I R-M systems.
A number of R-M systems, however, have required a multi-step cloning approach. For example, during acquisition of a new R-M system, it has been found that a number of cells face an establishment problem. Unless the methylase has a head start over the endonuclease, the cell is in danger of cleaving its own cellular DNA. E. coli appears to cope with this problem by repairing its DNA, and is able to assimilate many cloned R-M systems without apparent trauma. Not all systems are easily assimilated, however. The Dde I and BamH I R-M systems, for example, could not be cloned in a single step; rather, three steps were required (Howard et al., Nucleic Acids Res. 14:7939-7951 (1988)). There are, in fact, many systems for which only the methylase gene has been cloned. These systems may be similar to BamH I and Dde I, and may require similar approaches.
While a number of clones have been obtained by one or more of the above-described methods, see, Wilson, Gene 74, 281-289 (1988), cloning of type II R-M systems is not without difficulty. In particular, the genetics of many R-M systems have been found to be more complex, and methylase positive clones obtained by, for example, vector modification have not yielded the corresponding endonuclease gene. See, Wilson, Trends in Genetics 4, 314-318 (1988); Lunnen et al., Gene 74, 25-32 (1988). In fact, numerous obstacles are encountered in the process of cloning R-M systems using vector modification. For example, in some systems, the methylase and endonuclease genes may not be linked or the endonuclease used to fragment the bacterial DNA may cut either or both of the R-M genes. In other systems, such as BamH I and Dde I, the methylase may not protect sufficiently against digestion by the corresponding endonuclease, either because of inefficient expression in the transformation host, or because of the inherent control mechanism for expression of the methylase and endonuclease genes, or for unknown reasons. Modification may also be harmful to the host cell chosen for transformation. The endonuclease sought to be cloned may not be available in sufficient purity or quantity for methylase selection. In many systems, difficulties are also encountered in expressing the endonuclease gene in a transformation host cell of a different bacterial species.
In spite of the difficulties in cloning the more complex Type II R-M systems, it has been possible to obtain some endonuclease genes by modifying the vector modification selection method (see Lunnen et al., op. cit.) and/or by using a multi-step cloning approach. For example, formation of multiple libraries, construction of new cloning vectors, use of isoschizomers for the methylase selection step, mapping of methylase and/or endonuclease genes to determine the corresponding DNA sequences for use as hybridization probes, and other variations to the above-described approaches have yielded a number of recalcitrant recombinant R-M systems.
However, at the outset of any type II R-M cloning project, one simply does not know which, if any, variations or modifications to previous approaches may be required to clone any particular R-M system. For example, the detailed genetics of the particular system is usually unknown. Type II R and M genes may be present on the genome in any of four possible arrangements. Wilson, Trends in Genetics, supra. The sizes of the enzymes, and of the corresponding genes, vary widely between one R-M system and another, as do the DNA and amino acid sequences. In fact, even isoschizomeric restriction endonucleases have been found to display few similarities. Id. at 318; see also Chandrasegeran et al., Structure and Expression, Vol. I, pp 149-156, Adenine Press (1988).
Mechanisms of control of R and M gene expression also vary widely among type II systems. For example, expression of the endonuclease gene may be modification-dependent, as is indicated in the Ava II, Hae II, Hinf I, PstI and Xba I systems. Alternatively, the endonuclease gene may contain a large number of its own recognition sites as compared to the corresponding methylase gene, as in the Taq I system.
During transformation of cells to obtain clones carrying the target R-M system, cellular DNA is initially unmodified and consequently in danger of being digested by the target endonuclease. Transformation host cells must either contain DNA repair systems or be able to delay expression of the target endonuclease gene until modification is complete. If neither of these mechanisms is available to the transformation host, a problem is encountered in establishing the cloned genes in the host. As noted above, when establishment problems were encountered in cloning the Dde I and BamH I systems, it was necessary to introduce the methylase and endonuclease genes sequentially, to protect the DNA of the transformation host cells (Howard, K. A. et al., supra, Brooks et al., Gene 74: 13 (1988)). However, some R-M systems have resisted all attempts to clone them, and others have yielded only the methylase gene, possibly because of establishment difficulties. Wilson, Trends in Genetics 4, 317.
It has been found that transformation host cells may also contain systems that restrict foreign types of modification. For example, two systems have been identified in E. coli which restrict modified DNAs: the mcr system restricts DNA containing methyl-cytosine, and the mrr system restricts DNA containing methyl-adenine. It is therefore usually necessary to use E. coli strains that are defective in these systems. The presence of additional host cell restriction systems may also be responsible for the difficulties encountered in cloning of R-M systems.
Because purified restriction endonucleases, and to a lesser extent, modification methylases, are useful tools for characterizing and rearranging DNA in the laboratory, there is a commercial incentive to obtain strains of bacteria through recombinant DNA techniques that synthesize these enzymes in abundance. Such strains would be useful because they would simplify the task of purification and provide the means for production in commercially useful amounts.