Throughout this application various publications are referred to in parentheses. Full citations for these references may be found at the end of the specification immediately preceding the claims. The disclosures of these publications are hereby incorporated by reference in their entireties into the subject application to more fully describe the art to which the subject application pertains.
Methylation of cytosine in DNA is a major component of epigenetic regulation of gene expression. Cytosine methylation is important for normal growth and development (Dean et al., 2005; Monk et al., 1987) and is a major source of gene expression abnormalities in cancer (Fiegl and Elmasry, 2007; Zhu and Yao, 2007). DNA cytosine methylation can be used as a biomarker for cancer detection (Belinsky, 2004; Gonzalgo et al., 2007; Jubb et al., 2003; Zhu and Yao, 2007). Cytosine methylation may also play a role in regulating the induction of synaptic plasticity in the mature central nervous system (Levenson et al., 2006).
Various approaches have been used to analyze cytosine methylation (Ching et al. 2005; Hu et al. 2005; Khulan et al. 2006; Laird 2003; Ushijima 2005; Weber et al. 2005). Many of the techniques used to test cytosine methylation at multiple loci are not suitable for comparing methylation levels at different loci within a genome. There is a need for a platform for intragenomic profiling that will permit integrating studies of cytosine methylation with other whole-genome studies of epigenetic regulation.
The use of restriction enzymes that are sensitive to cytosine methylation provided many of the early insights into the distribution of methylated CpG dinucleotides in the mammalian genome. For example, digestion with HpaII revealed that most of the genome remains of high molecular weight despite the short recognition motif (5′-CCGG-3′) at which the enzyme cuts (Singer et al. 1979). It was subsequently recognized that between 55% and 70% of HpaII sites in animal genomes are methylated at the central cytosine (Bestor et al. 1984; Bird 1980), which is part of a CpG dinucleotide. The minority of genomic DNA that is cut to a size of hundreds of base pairs was defined as HpaII Tiny Fragments (HTFs) (Bird 1986), revealing a population of sites in the genome at which two HpaII sites are close to each other and both unmethylated on the same DNA molecule. Cloning and sequencing of these HTFs revealed them to be rich in (G+C) and in CpG dinucleotides, allowing base compositional criteria to be created to predict presumably hypomethylated CpG islands (Gardiner-Garden and Frommer 1987). Genome sequencing project data have revealed that fewer than 12% of HpaII sites in the human genome (and fewer than 9% in mouse) are located within annotated CpG islands (Fazzari and Greally 2004). This raised the question of whether a substantial proportion of HTFs is, in fact, derived from non-CpG island sequences and could be used to examine many non-CpG island sites in the genome for cytosine methylation status.
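By way of illustration only, the relationship between HpaII site spacing, methylation status, and fragment size described above can be sketched as a simple in silico digest. The following Python sketch is not part of the described method. The function names, the `methylated_sites` argument, and the 500 bp default for `max_len` are hypothetical choices made for this example (the text says only "hundreds of basepairs"), and the sketch assumes cleavage after the first cytosine of the motif (C^CGG) with a methylated site treated as completely uncut.

```python
import re

HPAII_MOTIF = "CCGG"  # recognition motif given in the text (5'-CCGG-3')
CUT_OFFSET = 1        # assumption: cleavage after the first C (C^CGG)


def hpaii_sites(seq):
    """Return the 0-based start position of every CCGG motif in seq."""
    return [m.start() for m in re.finditer(HPAII_MOTIF, seq.upper())]


def tiny_fragments(seq, methylated_sites=(), max_len=500):
    """Return (start, end) intervals bounded by two adjacent cut sites.

    methylated_sites: motif start positions treated as methylated at the
        central cytosine and therefore not cut (hypothetical input).
    max_len: illustrative size cutoff for a "tiny" fragment.
    """
    blocked = set(methylated_sites)
    # Only unmethylated sites are cleaved.
    cuts = [s + CUT_OFFSET for s in hpaii_sites(seq) if s not in blocked]
    # A fragment exists between each pair of consecutive cuts; keep short ones.
    return [(a, b) for a, b in zip(cuts, cuts[1:]) if b - a <= max_len]
```

In this sketch, marking any one site as methylated merges the two fragments on either side of it into a single longer fragment, which is why a short fragment is recovered only when both flanking sites are unmethylated on the same molecule.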
Most restriction enzyme-based or affinity-based techniques are designed to identify enriched methylated DNA regions in the genome. As most CG dinucleotides of animal genomes are methylated (Gruenbaum et al., 1981; Kunnath and Locker, 1982), including those in most transposable elements (Yoder et al., 1997), these approaches enrich for the majority of the genome, including repetitive sequences, rather than for the hypomethylated minority of unique sequences that tend to be located at functionally interesting sites. The presence of a hypomethylated site in an assay that enriches methylated DNA has to be inferred from the absence of signal, which can also occur due to technical problems or base compositional reasons.
The approach described in the present application allows the positive identification of hypomethylated loci and is robust for analysis of CG dinucleotide-enriched regions of the genome where restriction enzyme digestion can create short DNA fragments that can pose problems for analysis.