The present invention relates to a novel type II restriction endonuclease, CstMI. CstMI consists of a single polypeptide that possesses two related enzymatic functions. CstMI is an endonuclease that recognizes the DNA sequence 5′-AAGGAG-3′ and cleaves the phosphodiester bond between the 20th and 21st residues 3′ to this recognition sequence on this DNA strand, and between the 18th and 19th residues 5′ to the recognition sequence on the complement strand 5′-CTCCTT-3′, to produce a 2 base 3′ extension (hereinafter referred to as the CstMI restriction endonuclease). CstMI has a second enzymatic activity that recognizes the same DNA sequence, 5′-AAGGAG-3′, but modifies this sequence by the addition of a methyl group to prevent cleavage by the CstMI endonuclease. The present invention also relates to the DNA fragment encoding the CstMI enzyme, a vector containing this DNA fragment, a transformed host containing this DNA fragment, and a process for producing CstMI restriction endonuclease from such a transformed host. CstMI was identified as a potential endonuclease because of its amino acid sequence similarity to MmeI (see U.S. Application Publication No. US-2004-0091911-A1, filed concurrently herewith).
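The cleavage geometry described above can be illustrated with a short Python sketch. It is only a coordinate model under stated assumptions, not a description of how the enzyme was characterized: the names `Cut` and `cstmi_cuts` are hypothetical, the input is taken to be the top strand written 5′ to 3′, and both cut points are reported as top-strand indices, where a cut value k means the bond between `seq[k - 1]` and `seq[k]`.

```python
from typing import List, NamedTuple

SITE = "AAGGAG"      # recognition sequence on the strand cut between residues 20 and 21
SITE_RC = "CTCCTT"   # the same site as read 5'->3' on the complementary strand


class Cut(NamedTuple):
    site_start: int   # 0-based index of the 6-base site on the top strand
    strand: str       # "+" if AAGGAG lies on the top strand, "-" if on the bottom
    top_cut: int      # top strand is cut between seq[top_cut - 1] and seq[top_cut]
    bottom_cut: int   # bottom-strand cut, same convention, in top-strand coordinates


def _find_all(seq: str, motif: str) -> List[int]:
    """Return the start index of every (possibly overlapping) occurrence."""
    hits, i = [], seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


def cstmi_cuts(seq: str) -> List[Cut]:
    """Predict cut positions for sites in either orientation (illustrative model)."""
    seq = seq.upper()
    n = len(seq)
    cuts = []
    # AAGGAG on the top strand: cleavage lies downstream (to the right) of the site.
    for i in _find_all(seq, SITE):
        end = i + len(SITE)
        top, bottom = end + 20, end + 18
        if top < n:  # the 21st residue past the site must exist
            cuts.append(Cut(i, "+", top, bottom))
    # AAGGAG on the bottom strand: cleavage lies upstream (to the left) of the site.
    for j in _find_all(seq, SITE_RC):
        top, bottom = j - 18, j - 20
        if bottom > 0:  # the 21st residue before the site must exist
            cuts.append(Cut(j, "-", top, bottom))
    return cuts
```

In both orientations the top-strand cut lies two positions to the right of the bottom-strand cut, which corresponds to the 2 base 3′ extension described above.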
Restriction endonucleases are a class of enzymes that occur naturally in prokaryotes. There are several classes of restriction systems known, of which the type II endonucleases are the class useful in genetic engineering. When these type II endonucleases are purified away from other contaminating prokaryotic components, they can be used in the laboratory to break DNA molecules into precise fragments. This property enables DNA molecules to be uniquely identified and to be fractionated into their constituent genes. Restriction endonucleases have proved to be indispensable tools in modern genetic research. They are the biochemical ‘scissors’ by means of which genetic engineering and analysis are performed.
Restriction endonucleases act by recognizing and binding to particular sequences of nucleotides (the ‘recognition sequence’) along the DNA molecule. Once bound, the type II endonucleases cleave the molecule within, or to one side of, the sequence. Different restriction endonucleases have affinity for different recognition sequences. The majority of restriction endonucleases recognize sequences of 4 to 6 nucleotides in length, although recently a small number of restriction endonucleases which recognize 7 or 8 uniquely specified nucleotides have been isolated. Most recognition sequences contain a dyad axis of symmetry and in most cases all the nucleotides are uniquely specified. However, some restriction endonucleases have degenerate or relaxed specificities in that they recognize multiple bases at one or more positions in their recognition sequence, and some restriction endonucleases recognize asymmetric sequences. HaeIII, which recognizes the sequence 5′-GGCC-3′, is an example of a restriction endonuclease having a symmetrical, non-degenerate recognition sequence; HaeII, which recognizes 5′-(Pu)GCGC(Py)-3′, typifies restriction endonucleases having a degenerate or relaxed recognition sequence; while BspMI, which recognizes 5′-ACCTGC-3′, typifies restriction endonucleases having an asymmetric recognition sequence. Type II endonucleases with symmetrical recognition sequences generally cleave symmetrically within or adjacent to the recognition site, while those that recognize asymmetric sequences tend to cleave at a distance of 1 to 20 nucleotides to one side of the recognition site. The enzyme of this application, CstMI, along with MmeI, has the distinction of cleaving the DNA at the farthest distance from the recognition sequence of any known type II restriction endonuclease. More than two hundred unique restriction endonucleases have been identified among several thousand bacterial species that have been examined to date.
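The distinction between symmetrical, degenerate, and asymmetric recognition sequences can be made concrete with a minimal Python sketch. It assumes the standard IUPAC nucleotide codes, where R stands for a purine (Pu), Y for a pyrimidine (Py), and W for A or T; the function names are hypothetical and chosen only for illustration. A site is treated as symmetrical when it equals its own reverse complement.

```python
import re

# IUPAC codes used here: R = purine (A/G), Y = pyrimidine (C/T), W = A/T, N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]", "N": "[ACGT]"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C",
              "R": "Y", "Y": "R", "W": "W", "N": "N"}


def site_regex(site: str) -> "re.Pattern[str]":
    """Translate a (possibly degenerate) recognition sequence into a regex."""
    return re.compile("".join(IUPAC[base] for base in site))


def reverse_complement(site: str) -> str:
    """Reverse complement of a site, keeping degenerate codes consistent."""
    return "".join(COMPLEMENT[base] for base in reversed(site))


def is_symmetric(site: str) -> bool:
    """True when the site reads the same on both strands (dyad symmetry)."""
    return site == reverse_complement(site)


if __name__ == "__main__":
    for name, site in [("HaeIII", "GGCC"), ("HaeII", "RGCGCY"),
                       ("BspMI", "ACCTGC"), ("CstMI", "AAGGAG")]:
        kind = "symmetric" if is_symmetric(site) else "asymmetric"
        print(f"{name}: 5'-{site}-3' is {kind}")
```

Under this model HaeIII and HaeII are symmetrical (HaeII being degenerate as well), while BspMI and CstMI are asymmetric, which is why the complement of the CstMI site has to be written separately as 5′-CTCCTT-3′.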
Modification methylases are a second component of restriction systems. These enzymes are complementary to restriction endonucleases and they provide the means by which bacteria are able to protect their own DNA and distinguish it from foreign, infecting DNA. Modification methylases recognize and bind to the same nucleotide recognition sequence as the corresponding restriction endonuclease, but instead of breaking the DNA, they chemically modify one of the nucleotides within the sequence by the addition of a methyl group. Following methylation, the recognition sequence is no longer cleaved by the restriction endonuclease. The DNA of a bacterial cell is modified by virtue of the activity of its modification methylase and it is therefore insensitive to the presence of the endogenous restriction endonuclease. It is only unmodified, and therefore identifiably foreign, DNA that is sensitive to restriction endonuclease recognition and cleavage. Modification methyltransferases are usually separate enzymes from their cognate endonuclease partners. In some cases, there is a single polypeptide that possesses both a modification methyltransferase function and an endonuclease function, for example, Eco57I. In such cases, there is usually a second methyltransferase present as part of the restriction-modification system. CstMI, however, consists of a single polypeptide that possesses both a modification methyltransferase function and an endonuclease function but does not have a second methyltransferase peptide as part of the restriction-modification system. In this regard CstMI is similar to the MmeI restriction-modification system.
Endonucleases are named according to the bacteria from which they are derived. Thus, the species Haemophilus aegyptius, for example, synthesizes three different restriction endonucleases, named HaeI, HaeII and HaeIII. These enzymes recognize and cleave the sequences 5′-(W)GGCC(W)-3′, 5′-(Pu)GCGC(Py)-3′ and 5′-GGCC-3′, respectively. Escherichia coli RY13, on the other hand, synthesizes only one enzyme, EcoRI, which recognizes the sequence 5′-GAATTC-3′.
While not wishing to be bound by theory, it is thought that in nature, restriction endonucleases play a protective role in the welfare of the bacterial cell. They enable bacteria to resist infection by foreign DNA molecules such as viruses and plasmids that would otherwise destroy or parasitize them. They impart resistance by binding to infecting DNA molecules and cleaving them in each place that the recognition sequence occurs. The disintegration that results inactivates many of the infecting genes and renders the DNA susceptible to further degradation by exonucleases.
More than 3000 restriction endonucleases have been isolated from various bacterial strains. Of these, more than 240 recognize unique sequences, while the rest share common recognition specificities. Restriction endonucleases which recognize the same nucleotide sequence are termed “isoschizomers.” Although the recognition sequences of isoschizomers are the same, they may vary with respect to site of cleavage (e.g., XmaI v. SmaI, Endow, et al., J. Mol. Biol. 112:521 (1977); Waalwijk, et al., Nucleic Acids Res. 5:3231 (1978)) and in cleavage rate at various sites (XhoI v. PaeR7I, Gingeras, et al., Proc. Natl. Acad. Sci. U.S.A. 80:402 (1983)).
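The notion of isoschizomers can be sketched as a simple grouping by recognition sequence. The Python example below is illustrative only: the table `ENZYMES` and both function names are hypothetical, and the recognition sequences and cut offsets for SmaI, XmaI, XhoI and PaeR7I are not stated in this text but are filled in from general knowledge, so they should be verified against a curated source such as REBASE. The cut offset is the number of bases of the site that lie 5′ to the cut on the strand shown.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# name -> (recognition sequence, cut offset within the site on the strand shown).
# Entries other than HaeIII and EcoRI sites are assumptions from general knowledge.
ENZYMES: Dict[str, Tuple[str, int]] = {
    "SmaI": ("CCCGGG", 3),    # CCC^GGG
    "XmaI": ("CCCGGG", 1),    # C^CCGGG
    "XhoI": ("CTCGAG", 1),    # C^TCGAG
    "PaeR7I": ("CTCGAG", 1),  # C^TCGAG
    "HaeIII": ("GGCC", 2),    # GG^CC
    "EcoRI": ("GAATTC", 1),   # G^AATTC
}


def isoschizomer_groups(enzymes: Dict[str, Tuple[str, int]]) -> Dict[str, List[str]]:
    """Group enzymes by recognition sequence; keep only groups with 2+ members."""
    by_site = defaultdict(list)
    for name, (site, _offset) in enzymes.items():
        by_site[site].append(name)
    return {site: sorted(names) for site, names in by_site.items() if len(names) > 1}


def groups_with_different_cuts(enzymes: Dict[str, Tuple[str, int]]) -> List[str]:
    """Sites whose isoschizomers do not all cut at the same position."""
    result = []
    for site, names in isoschizomer_groups(enzymes).items():
        if len({enzymes[name][1] for name in names}) > 1:
            result.append(site)
    return sorted(result)
```

With this table, XmaI and SmaI fall into one group whose members cut at different positions, while XhoI and PaeR7I share both site and cut position; differences in cleavage rate, as reported for the latter pair, are not captured by such a model.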
Restriction endonucleases have traditionally been classified into three major classes: type I, type II and type III. The type I restriction systems assemble a multi-peptide complex consisting of restriction polypeptide, modification polypeptide, and specificity, or DNA recognition, polypeptide. Type I systems require a divalent cation, ATP and S-adenosyl-methionine (SAM) as cofactors. Type I systems cleave DNA at random locations up to several thousand basepairs away from their specific recognition site. The type III systems generally recognize an asymmetric DNA sequence and cleave at a specific position 20 to 30 basepairs to one side of the recognition sequence. Such systems require the cofactor ATP in addition to SAM and a divalent cation. The type III systems assemble a complex of endonuclease polypeptide and modification polypeptide that either modifies the DNA at the recognition sequence or cleaves it. Type III systems produce partial digestion of the DNA substrate due to this competition between their modification and cleavage activities, and so have not been useful for genetic manipulation.
CstMI can be classified as a type II endonuclease in that it does not require ATP for DNA cleavage activity. Unlike other type II enzymes, however, CstMI consists of a single polypeptide that combines both endonuclease and modification activities and is sufficient by itself to form the entire restriction-modification system. CstMI, like the related endonuclease MmeI, cleaves the farthest distance from the specific DNA recognition sequence of any type II endonuclease. CstMI is quite large and appears to have three functional domains combined in one polypeptide. These consist of an amino-terminal DNA cleavage domain which may also be involved in DNA recognition, a DNA modification domain most similar to the gamma-class N6-methyladenine (m6A) methyltransferases, and a carboxy-terminal domain presumed to be involved in dimer formation and possibly DNA recognition. The enzyme requires SAM for both cleavage and modification activity. The single CstMI polypeptide is sufficient to modify the plasmid vector carrying the gene in vivo to provide protection against CstMI cleavage in vitro, yet it is also able to cleave unmodified DNAs in vitro when using the endonuclease buffer containing Mg++ and SAM.
There is a continuing need for novel type II restriction endonucleases. Although type II restriction endonucleases which recognize a number of specific nucleotide sequences are currently available, new restriction endonucleases which recognize novel sequences provide greater opportunities and ability for genetic manipulation. Each new unique endonuclease enables scientists to precisely cleave DNA at new positions within the DNA molecule, with all the opportunities this offers.