Various publications or patents are referenced in this application to describe the state of the art to which the invention pertains. Each of these publications or patents is incorporated by reference herein.
In vivo methylation of DNA has been used successfully to study protein-DNA interactions in the chromatin of living cells. A high frequency of methyltransferase targets is critical for high-resolution mapping of chromatin structure. Among currently available methyltransferase probes, the only de novo dinucleotide methyltransferase is M.SssI, which recognizes a CpG site (Renbaum, P., Abrahamove, D., Fainsod, A., Wilson, G., Rottem, S. and Razin, A. (1990) Nucleic Acids Res., 18, 1145–1152). Because the CpG dinucleotide is under-represented in the genome, the resolution of chromatin structure maps obtained with this enzyme is about 35 base pairs on average in S. cerevisiae (Dujon, B., Alexandraki, D., André, B., Ansorge, W., Baladron, V., Ballesta, J. P. G., Banrevi, A., Bolle, P. A., Bolotin-Fukuhara, M., Bossier, P. et al. (1994) Nature, 369, 371–378). At this moderate resolution, M.SssI may suffice to detect the presence of a positioned nucleosome (146 bp in yeast) without the need to introduce additional CpG sites into native DNA sequences. However, this resolution is insufficient for mapping the interactions of non-histone regulatory proteins, since the target DNA sequence of most regulatory proteins is typically ~20–30 base pairs or less. For example, the yeast TATA box binding protein (TBP) recognizes and binds to an 8 bp sequence (Kim, Y., Geiger, J. H., Hahn, S. and Sigler, P. B. (1993) Nature, 365, 512–520), while the well-characterized transcriptional activator Gal4p binds to a 17 bp consensus sequence (Giniger, E., Varnum, S. M. and Ptashne, M. (1985) Cell, 40, 767–774). Furthermore, methylation of CpG islands has been implicated as an important controlling element for gene regulation in mammalian systems, which may limit the application of M.SssI in higher organisms (Tazi, J. and Bird, A. (1990) Cell, 60, 909–920).
To address both the limited resolution and the possible inability to use M.SssI in higher organisms, it is essential to clone and express cytosine-5-DNA methyltransferases (5-meC MTases) with different specificities but similarly small recognition sites.
A family of double-stranded DNA viruses that infect certain unicellular, eukaryotic, Chlorella-like green algae is reported to be a rich source of restriction/modification systems (Nelson, M., Zhang, Y. and Van Etten, J. L. (1993) DNA Methylation: Molecular Biology and Biological Significance. Birkhauser-Verlag Press, Basel, Switzerland, pp. 186–211; Nelson, M., Burbank, D. E. and Van Etten, J. L. (1998) Biological Chem. 379, 423–428). Among the 37 viruses infecting Chlorella NC64A and the five viruses infecting Chlorella Pbi that have been partially characterized, 39 viral DNAs contain 5-methylcytosine, ranging in concentration from 0.1 to 47% of total cytosine (Nelson & Van Etten, 1993, supra; Nelson & Van Etten, 1998, supra).
One cytosine methyltransferase, M.CviJI, has been cloned from Chlorella virus IL-3A and shown to recognize the nucleotide sequence RGC(T/C/G) (Shields, S. L., Burbank, D. E., Grabherr, R. and Van Etten, J. L. (1990) Virology, 176, 16–24). As determined by the resistance or sensitivity of the viral DNAs to over 70 methylation-sensitive restriction endonucleases, at least five independent 5-meC modification systems are predicted to be encoded by some of the more highly modified viruses, including methyltransferases thought to recognize CpC and RpCpY (Nelson & Van Etten, 1993, supra; Nelson & Van Etten, 1998, supra). Taking the composition of the yeast genome as an example, one CpC site per 13.9 bp and one RpCpY site per 10.7 bp can be expected on average. Achieving this level of resolution would allow mapping of the interactions of most non-histone regulatory proteins. The cloning of methyltransferases from Chlorella viruses could therefore greatly extend the resolution of chromatin mapping and allow in vivo chromatin mapping to be extended to higher organisms.
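The site-frequency arithmetic behind such resolution estimates can be illustrated with a minimal Python sketch. It is not the method used to derive the figures quoted above. The function names (expected_spacing, observed_spacing), the independent-base composition model, and the GC fraction of 0.38 are all assumptions made for illustration. A target site is counted when the motif occurs on either strand, so a non-palindromic motif such as CpC is also counted where its reverse complement (GpG) appears on the top strand.

```python
import re

# Illustrative subset of IUPAC nucleotide codes (R = purine, Y = pyrimidine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT", "N": "ACGT"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R", "N": "N"}


def reverse_complement(motif):
    """Reverse complement of a (possibly degenerate) motif."""
    return "".join(COMPLEMENT[code] for code in reversed(motif))


def strand_motifs(motif):
    """Top-strand patterns that mark a site on either strand.

    A palindromic motif such as CG is listed once so it is not double counted.
    """
    rc = reverse_complement(motif)
    return [motif] if rc == motif else [motif, rc]


def expected_spacing(motif, gc_fraction):
    """Expected mean bp between sites, ASSUMING independent bases.

    Assumes P(G) = P(C) = gc_fraction / 2 and P(A) = P(T) = (1 - gc_fraction) / 2.
    Real genomes deviate from this model (for example, CpG depletion).
    """
    base_p = {
        "A": (1 - gc_fraction) / 2, "T": (1 - gc_fraction) / 2,
        "C": gc_fraction / 2, "G": gc_fraction / 2,
    }
    per_position = 0.0
    for m in strand_motifs(motif):
        p = 1.0
        for code in m:
            p *= sum(base_p[b] for b in IUPAC[code])
        per_position += p
    return 1.0 / per_position


def observed_spacing(seq, motif):
    """Mean bp between sites counted directly in a sequence (overlaps allowed)."""
    seq = seq.upper()
    hits = 0
    for m in strand_motifs(motif):
        # A zero-width lookahead lets overlapping occurrences be counted.
        pattern = re.compile("(?=" + "".join("[" + IUPAC[c] + "]" for c in m) + ")")
        hits += sum(1 for _ in pattern.finditer(seq))
    return len(seq) / hits if hits else float("inf")


if __name__ == "__main__":
    # 0.38 is an assumed, approximate GC fraction used only for illustration.
    for name, motif in [("CpG", "CG"), ("CpC", "CC"), ("RpCpY", "RCY")]:
        print(name, round(expected_spacing(motif, 0.38), 1))
```

Under these assumptions the model gives roughly one CpC site per 13.9 bp and one RpCpY site per 10.5 bp, in the same range as the figures quoted above, and roughly one CpG site per 27.7 bp, which is denser than the reported 35 bp and consistent with CpG being under-represented. For a real estimate, observed_spacing would be applied to actual genomic sequence rather than relying on the composition model.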