The genomes of higher eukaryotes contain the modified nucleoside 5-methyl cytosine (5-meC). This modification is usually found as part of the dinucleotide CpG. Cytosine is converted to 5-methylcytosine in a reaction that involves flipping a target cytosine out of an intact double helix and transfer of a methyl group from S-adenosylmethionine by a methyltransferase enzyme (Klimasauskas et al., Cell 76:357-369, 1994). This enzymatic conversion is the only epigenetic modification of DNA known to exist in vertebrates and is essential for normal embryonic development (Bird, Cell 70:5-8, 1992; Laird and Jaenisch, Human Mol. Genet. 3:1487-1495, 1994; and Li et al., Cell 69:915-926, 1992).
The frequency of the CpG dinucleotide in the human genome is only about 20% of the statistically expected frequency, possibly because of spontaneous deamination of 5-meC to T (Schoreret et al., Proc. Natl. Acad Sci. USA 89:957-961, 1992). There are about 28 million CpG doublets in a haploid copy of the human genome and it is estimated that about 70-80% of the cytosines at CpGs are methylated. Regions where CpG is present at levels that are approximately the expected frequency are referred to as “CpG islands” (Bird, A. P., Nature 321:209-213, 1986). These regions have been estimated to comprise about 1% of vertebrate genomes and account for about 15% of the total number of CpG dinucleotides. CpG islands are typically between 0.2 and 1 kb in length and are often located upstream of housekeeping and tissue-specific genes. CpG islands are often located upstream of transcribed regions, but may also extend into transcribed regions. About 2-4% of cytosines are methylated and probably the majority of cytosines that are 5′ of Gs are methylated. Most of the randomly distributed CpGs are methylated, but only about 20% of the CpGs in CpG islands are methylated. Recent studies on CpG islands suggest that promoters segregate into two classes by CpG content. See, Saxonov et al., PNAS 103(5):1412-7 (2006).
DNA methylation is an epigenetic determinant of gene expression. Patterns of CpG methylation are heritable, tissue specific, and correlate with gene expression. The consequence of methylation is usually gene silencing. DNA methylation also correlates with other cellular processes including embryonic development, chromatin structure, genomic imprinting, somatic X-chromosome inactivation in females, inhibition of transcription and transposition of foreign DNA and timing of DNA replication. When a gene is highly methylated it is less likely to be expressed, possibly because CpG methylation prevents transcription factors from recognizing their cognate binding sites. Proteins that bind methylated DNA may also recruit histone deacetylase to condense adjacent chromatin. Such “closed” chromatin structures prevent binding of transcription factors. Thus the identification of sites in the genome containing 5-meC is important in understanding cell-type specific programs of gene expression and how gene expression profiles are altered during both normal development and diseases such as cancer. Precise mapping of DNA methylation patterns in CpG islands has become essential for understanding diverse biological processes such as the regulation of imprinted genes, X chromosome inactivation, and tumor suppressor gene silencing in human cancer caused by increase methylation.
Methylation of cytosine residues in DNA plays an important role in gene regulation. Methylation of cytosine may lead to decreased gene expression by, for example, disruption of local chromatin structure, inhibition of transcription factor-DNA binding, or by recruitment of proteins which interact specifically with methylated sequences and prevent transcription factor binding. DNA methylation is required for normal embryonic development and changes in methylation are often associated with disease. Genomic imprinting, X chromosome inactivation, chromatin modification, and silencing of endogenous retroviruses all depend on establishing and maintaining proper methylation patterns. Abnormal methylation is a hallmark of cancer cells and silencing of tumor suppressor genes is thought to contribute to carcinogenesis. Methylation mapping using microarray-based approaches may be used, for example, to profile cancer cells revealing a pattern of DNA methylation that may be used, for example, to diagnose a malignancy, predict treatment outcome or monitor progression of disease. Methylation in eukaryotes can also function to inhibit the activity of viruses and transposons, see Jones et al., EMBO J. 17:6385-6393 (1998). Alterations in the normal methylation process have also been shown to be associated with genomic instability (Lengauer et al., Proc. Natl. Acad. Sci. USA 94:2545-2550, 1997). Such abnormal epigenetic changes may be found in many types of cancer and can serve as potential markers for oncogenic transformation.