1. Field of the Invention
The present invention relates generally to methylation of genomic DNA and more specifically to the identification of sequences normally methylated in the genome and their relationship to disease states.
2. Background Information
DNA methylation is central to many mammalian processes including embryonic development, X-inactivation, genomic imprinting, regulation of gene expression, and host defense against parasitic sequences, as well as abnormal processes such as carcinogenesis, fragile site expression, and cytosine to thymine transition mutations. DNA methylation in mammals is achieved by the transfer of a methyl group from S-adenosyl-methionine to the C5 position of cytosine. This reaction is catalyzed by DNA methyltransferases and is specific to cytosines in CpG dinucleotides. Seventy percent of all cytosines in CpG dinucleotides in the human genome are methylated and prone to deamination, resulting in a cytosine to thymine transition. This process leads to an overall reduction in the frequency of guanine and cytosine to about 40% of all nucleotides and a further reduction in the frequency of CpG dinucleotides to about a quarter of their expected frequency (Bird 1986).
The exception to CpG underrepresentation in the genome is CpG islands, which were first identified as Hpa II tiny fragments (Bird et al. 1985), and were later formally defined as sequences >200 bp in length, with a GC content >0.5, and a CpGobs/CpGexp (observed to expected ratio based on GC content) >0.6 (Gardiner-Garden and Frommer 1987). CpG islands have been estimated to constitute 1%-2% of the mammalian genome (Antequera and Bird 1993), and are found in the promoters of all housekeeping genes, as well as in a less conserved position in 40% of genes showing tissue-specific expression (Larsen et al. 1992). The persistence of CpG dinucleotides in CpG islands is largely attributed to a general lack of methylation of CpG islands, regardless of expression status (reviewed in Cross and Bird 1995).
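The three criteria quoted above (length >200 bp, GC content >0.5, CpGobs/CpGexp >0.6) can be illustrated with a minimal sketch. The function names are hypothetical, and the expected CpG count is assumed to be (number of C × number of G) / sequence length, which is the form commonly attributed to Gardiner-Garden and Frommer; the text above states only that the ratio is "based on GC content".

```python
def cpg_stats(seq):
    """Return (length, GC content, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    observed = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_content = (c + g) / n if n else 0.0
    # Assumed form of the expected count: (#C * #G) / length
    expected = (c * g) / n if n else 0.0
    ratio = observed / expected if expected else 0.0
    return n, gc_content, ratio


def meets_island_criteria(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the three thresholds quoted in the text to a single sequence."""
    n, gc_content, ratio = cpg_stats(seq)
    return n > min_len and gc_content > min_gc and ratio > min_ratio
```

This sketch evaluates one sequence as a whole; scanning a genome would typically slide a window across the sequence and apply the same test to each window.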
Although CpG islands are believed to be unmethylated, two exceptions to this rule in normal cells are the inactive X chromosome (Yen et al. 1984) and imprinted genes (Ferguson-Smith et al. 1993; Razin and Cedar 1994; Barlow 1995), both of which are associated with methylated CpG islands. Genomic imprinting is the parental origin-specific differential expression of the two alleles of a gene, and most imprinted genes show differential germline methylation of associated CpG islands (reviewed in Ohlsson et al. 2001). A third exception to the rule of methylation exclusion of CpG islands is aberrant methylation of CpG islands in tumors and in immortalized cultured cells, and such CpG island methylation is thought to contribute to carcinogenesis (Herman et al. 1994; Merlo et al. 1995).
Because of the interest in DNA methylation, genomic imprinting, and cancer, several general approaches have been used to identify CpG islands that are differentially methylated in specific cell types, such as screening tumor-normal pairs for cancer-related methylation changes (Huang et al. 1999; Shiraishi et al. 1999; Toyota et al. 1999), or pronuclear transplantation to examine differential parental origin for imprinted genes (Hayashizaki et al. 1994; Plass et al. 1996). However, there are no reports of successfully using a systemic effort to identify unique, methylated CpG islands.
A variety of genome scanning methods have been used to identify altered methylation sites in cancer cells. For example, one method involves restriction landmark genomic scanning (Kawai et al., Mol. Cell. Biol. 14:7421-7427, 1994), and another involves methylation-sensitive arbitrarily primed PCR (Gonzalgo et al., Cancer Res. 57:594-599, 1997). Changes in methylation patterns at specific CpG sites have been monitored by digestion of genomic DNA with methylation-sensitive restriction enzymes followed by Southern analysis of the regions of interest. The digestion-Southern method is straightforward, but it has inherent disadvantages in that it requires a large amount of DNA (at least 5 µg) and has a limited scope for analysis of CpG sites (as determined by the presence of recognition sites for methylation-sensitive restriction enzymes). Another method for analyzing changes in methylation patterns is a PCR-based process in which genomic DNA is digested with methylation-sensitive restriction enzymes prior to PCR amplification (Singer-Sam et al., Nucl. Acids Res. 18:687, 1990). However, this method has not been shown to be effective because of a high degree of false positive signals (methylation present) due to inefficient enzyme digestion or overamplification in a subsequent PCR reaction.
Genomic sequencing has been simplified for analysis of DNA methylation patterns and 5-methylcytosine distribution by using bisulfite treatment (Frommer et al., Proc. Natl. Acad. Sci. USA 89:1827-1831, 1992). Bisulfite treatment of DNA distinguishes methylated from unmethylated cytosines, but the original bisulfite genomic sequencing method requires large-scale sequencing of multiple plasmid clones to determine overall methylation patterns, which prevents this technique from being commercially useful for determining methylation patterns in any type of routine diagnostic assay.
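How bisulfite treatment distinguishes methylated from unmethylated cytosines can be illustrated with a small simulation. The sketch below rests on the standard chemistry, which the text above does not spell out: unmethylated cytosine is deaminated to uracil and is read as thymine after amplification, whereas 5-methylcytosine is left as cytosine. The function names are hypothetical, only a single strand is modeled, and conversion is assumed to be complete.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate the sequence read after bisulfite treatment and amplification.

    seq: DNA string for one strand.
    methylated_positions: 0-based indices of methylated cytosines.
    Unmethylated C is read as T; methylated C stays C (complete conversion assumed).
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def infer_methylated_cpgs(original, converted):
    """Return 0-based positions of CpG cytosines that remained C after conversion."""
    original = original.upper()
    return [
        i
        for i in range(len(original) - 1)
        if original[i:i + 2] == "CG" and converted[i] == "C"
    ]
```

Comparing the converted read with the reference sequence recovers which CpG cytosines were protected, which is the comparison that sequencing of multiple clones performs at scale.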
In addition, other techniques have been reported which utilize bisulfite treatment of DNA as a starting point for methylation analysis. These include methylation-specific PCR (MSP) (Herman et al., Proc. Natl. Acad. Sci. USA 93:9821-9826, 1996); and restriction enzyme digestion of PCR products amplified from bisulfite-converted DNA (Sadri and Hornsby, Nucl. Acids Res. 24:5058-5059, 1996; and Xiong and Laird, Nucl. Acids Res. 25:2532-2534, 1997).
PCR techniques have been developed for detection of gene mutations (Kuppuswamy et al., Proc. Natl. Acad. Sci. USA 88:1143-1147, 1991) and quantitation of allelic-specific expression (Szabo and Mann, Genes Dev. 9:3097-3108, 1995; and Singer-Sam et al., PCR Methods Appl. 1:160-163, 1992). Such techniques use internal primers, which anneal to a PCR-generated template and terminate immediately 5′ of the single nucleotide to be assayed. However, an allelic-specific expression technique has not been tried within the context of assaying for DNA methylation patterns.
Therefore, there remains a need for a method for using a systemic or genome-wide approach to identify unique, methylated CpG islands, GC rich regions and CpG dinucleotides, including normally methylated CpG sequences.