The identification of early genetic changes in tumorigenesis is a primary focus in molecular cancer research. Characterization of the nature and pattern of cancer-associated genetic alterations will allow for early detection, diagnosis and treatment of cancer. Such genetic alterations in vertebrates fall generally into one of three categories: gain or loss of genetic material; mutation of genetic material; or methylation at cytosine residues in CpG dinucleotides within “CpG islands.” Among these, DNA methylation is unique in that it is a mechanism for modifying the base sequence of DNA without altering its coding function, and because it is a heritable, reversible epigenetic change. Changes in methylation state are also known to affect gene expression (e.g., transcriptional initiation of genes where CpG islands are located at or near the promoter region) or genomic stability. DNA methylation plays a role in gene inactivation, cell differentiation, tumorigenesis, X-chromosome inactivation and genomic imprinting, and is required for mammalian development (Li, et al., Cell 69:915-926, 1992; Okano et al., Cell 99:247-57, 1999).
DNA methylation in higher-order eukaryotic organisms. In higher-order eukaryotic organisms, DNA is methylated only at cytosines located 5′ to guanosine in the CpG dinucleotide. This modification has important regulatory effects on gene expression, predominantly when it involves CpG-rich areas (CpG islands) located in the promoter region of a gene sequence. Gene silencing through DNA methylation has been shown to be a major transcriptional regulatory mechanism in mammalian, plant and fungal systems (Colot and Rossignol, Bioessays 21:402-1, 1999). Hypermethylation of promoter regions of DNA has been correlated with the progression of cancer (Jones & Laird, Nat. Genet. 21:163-7, 1999) and the etiology of aging (Ahuja et al., Cancer Res. 58:5489-94, 1998). Extensive methylation of CpG islands has been associated with transcriptional inactivation of selected imprinted genes and genes on the inactive X chromosome of females. Aberrant methylation of normally unmethylated CpG islands has been described as a frequent event in immortalized and transformed cells and has been frequently associated with transcriptional inactivation of tumor suppressor genes in human cancers.
The exact mechanisms of DNA methylation and demethylation have not been determined, although recently discovered methyltransferases, demethylases and methyl-CpG binding proteins (Amir et al., Nat. Genet. 23:185-8, 1999; Okano et al., Cell 99:247-57, 1999) will increase understanding of these processes. These DNA binding proteins and enzymes thus use 5-methylcytosine in DNA as a key recognition signal to mediate transcriptional regulation. DNA cytosine methylation is a post-replicative process catalyzed by DNA methyltransferases, whereas demethylation or removal of 5-methylcytosine from DNA occurs most likely through the action of specific DNA glycosylases.
DNA methyltransferases. Mammalian cells possess methylases that methylate cytosine residues on DNA that are 5′ neighbors of guanine in CpG dinucleotides (CpG). Methylation occurs after cytosine has been incorporated into DNA in a process catalyzed by DNA methyltransferases (“Dnmts”), which transfer the methyl group from S-adenosylmethionine to the 5-position of the pyrimidine ring in, characteristically but not exclusively, the context of the palindromic CpG dinucleotide (Ramsahoye et al., Proc Natl Acad Sci USA. 97:5237-42, 2000). 5-Methylcytosine is asymmetrically distributed in the genome and is most commonly found in CpG-poor regions, since most CpG islands in somatic cells remain methylation-free, except for the promoters of imprinted genes and genes on the inactive X-chromosome (Bird et al., Cell 40:91-99, 1985), where methylation of 5′ regulatory regions can lead to transcriptional repression.
Three Dnmt enzymes are known in mouse and human, and these have overlapping yet distinct abilities to methylate “hemimethylated” and completely unmethylated CpG dinucleotide pairs (i.e., “maintenance” and “de novo” methylation, respectively). Hemi-methylation is defined as a state in which the two opposing cytosines on either DNA strand in a single palindromic CpG dinucleotide differ in that one is methylated at the C-5 position, and the other is not.
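The distinction between unmethylated, hemimethylated and fully methylated CpG dinucleotide pairs defined above can be illustrated with a minimal sketch. The names DyadState, classify_cpg_dyad and methylation_mode are hypothetical and are introduced here only for illustration; they do not correspond to any published software.

```python
from enum import Enum


class DyadState(Enum):
    UNMETHYLATED = "unmethylated"
    HEMIMETHYLATED = "hemimethylated"
    FULLY_METHYLATED = "fully methylated"


def classify_cpg_dyad(top_methylated: bool, bottom_methylated: bool) -> DyadState:
    """Classify one palindromic CpG dinucleotide pair.

    Each argument states whether the cytosine on that strand carries a
    methyl group at the C-5 position.
    """
    if top_methylated and bottom_methylated:
        return DyadState.FULLY_METHYLATED
    if top_methylated != bottom_methylated:
        # Exactly one of the two opposing cytosines is methylated.
        return DyadState.HEMIMETHYLATED
    return DyadState.UNMETHYLATED


def methylation_mode(state: DyadState) -> str:
    """Name the kind of methylation that would act on a pair in this state."""
    if state is DyadState.HEMIMETHYLATED:
        return "maintenance"
    if state is DyadState.UNMETHYLATED:
        return "de novo"
    return "none (already fully methylated)"


if __name__ == "__main__":
    for top, bottom in [(False, False), (True, False), (True, True)]:
        state = classify_cpg_dyad(top, bottom)
        print(top, bottom, state.value, methylation_mode(state))
```

In this sketch a newly replicated pair, in which only the parental strand is methylated, is classified as hemimethylated and is therefore a substrate for "maintenance" methylation.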
The predominant Dnmt in the cell, Dnmt1, was cloned and characterized by Bestor and colleagues (Bestor et al., J. Mol. Biol. 203:971-83, 1988; Bestor, Gene, 74:9-12, 1988) and is localized to replication machines in the S-phase nucleus (Leonhardt et al., Cell 71:865-73, 1992; Rountree et al., Nat. Genet 25:269-77, 2000). Since Dnmt1 shows a preference for hemimethylated CpG pairs (Gruenbaum et al., FEBS Lett. 124:67-71, 1981; Bestor and Ingram, Proc Natl Acad Sci USA. 80:5559-63, 1983), it is considered to be an excellent candidate for copying the pattern of methylation present on the parental strand after DNA has been replicated (i.e., “maintenance” methylation). However, Dnmt1 is capable of modifying unmethylated DNA in the test tube, and is thus also a candidate for inducing de novo methylation. The recently discovered Dnmts, Dnmt3a and 3b (Okano et al., Nucleic Acids Res. 26:2536-40, 1998), show equal activities in vitro for unmethylated and hemimethylated substrates, and have been shown to be capable of de novo methylation of transfected DNA in culture (Hsieh, Mol Cell Biol. 19:8211-8, 1999) and in Drosophila (Lyko et al., Nat. Genet. 23:363-6, 1999). Interestingly, satellite DNAs appear to be a preferred target for the human DNMT3B enzyme, because these satellite DNA sequences are specifically undermethylated in patients with ICF syndrome, characterized by germ-line mutations in the DNMT3B gene (Hansen et al., Proc Natl Acad Sci USA. 96:14412-7, 1999; Okano et al., Cell 99:247-57, 1999; Xu et al., Nature 402:187-91, 1999).
DNA glycosylases. Base excision repair (BER) occurs in vivo to repair DNA base damage involving relatively minor disturbances in the helical DNA structure, such as deaminated, oxidized, alkylated or absent bases. Numerous DNA glycosylases are known in the art, and function in vivo during BER to release damaged or modified bases by cleavage of the glycosidic bond linking such bases to the sugar-phosphate backbone of DNA (see Memisoglu & Samson, Mutation Research 451:39-51, 2000). All DNA glycosylases cleave glycosidic bonds, but differ in their base substrate specificity and in their reaction mechanisms. Moreover, a subset of DNA glycosylases possesses an additional apurinic/apyrimidinic (AP) lyase activity, and one DNA glycosylase (Ogg1) has an associated DNA deoxyribophosphatase activity (Sandigursky et al., Nucleic Acids Res. 25:4557-4561, 1997).
The recently described enzyme 5-methylcytosine DNA glycosylase (5-MCDG) provides a potential mechanism for demethylation of methylcytosine residues in DNA. Specifically, 5-MCDG acts by cleaving glycosidic bonds at methylated CpG sites of DNA, removing 5-methylcytosine (5-MeC) from the DNA backbone as a free base (Wolffe et al., Proc. Nat. Acad. Sci. USA 96:5894-5896, 1999).
Two types of 5-MCDG enzymes have been described. One type, found in both humans and chicken, comprises bi-functional enzymes having both G/T mismatch and 5-MCDG activity (Zhu et al., Proc. Natl. Acad. Sci. USA 98:5031-6, 2001; Zhu et al., Nuc. Acid Res. 28:4157-4165, 2000; and Neddermann et al., J.B.C. 271:12767-74, 1996). The other type (substantially purified from human sources) corresponds to a mono-functional enzyme having only 5-MCDG activity (Vairapandi & Duker, Oncogene 13:933-938, 1996; Vairapandi et al., J. Cell. Biochem. 79:249-260, 2000).
The mono-functional human version of 5-methylcytosine DNA glycosylase cleaves DNA specifically at fully methylated CpG sites, and is inactive on hemimethylated DNA (Vairapandi & Duker, supra; Vairapandi et al., supra), in contrast to the above-mentioned bi-functional enzymes. A recombinant version of the bi-functional chick embryo 5-methylcytosine DNA glycosylase has a greater activity for hemimethylated DNA than for fully methylated DNA, but its relative activity may be potentiated by the addition of recombinant CpG-rich RNA, ATP and the enzyme RNA helicase (Zhu et al., supra).
The mono-functional human 5-methylcytosine DNA glycosylase activity is associated with such accessory factors as the nuclear protein proliferating cell nuclear antigen (PCNA) (Vairapandi et al., supra). The DNA glycosylase activity may require an RNA component for full enzyme activity; however, the activity is apparently insensitive to RNase treatment (Vairapandi et al., supra; Swisher et al., Nuc. Acid Res. 26:5573-5580, 1998).
Limitations of the Art. Changes in global levels of methylation and regional changes in patterns of methylation (e.g., within CpG islands) are among the earliest and most frequently observed events known in many human cancers. For this reason, the activity of DNA methylases and knowledge of methylation patterns can provide an early screen for cancer detection.
There are various art-recognized assays for assessing the methylation state at particular CpG sequences, once the sequence region comprising them has been identified so that specific primers and/or probes can be constructed. Such assays include: DNA sequencing methods; Southern blotting methods; MethyLight™ (fluorescence-based real-time PCR technique described by Eads et al., Cancer Res. 59:2302-2306, 1999; U.S. Pat. No. 6,331,393); MS-SNuPE (Methylation-sensitive Single Nucleotide Primer Extension assay described by Gonzalgo & Jones, Nucleic Acids Res. 25:2529-2531, 1997; U.S. Pat. No. 6,251,594); MSP (Methylation-specific PCR assay described by Herman et al., Proc. Natl. Acad. Sci. USA 93:9821-9826, 1996; U.S. Pat. No. 5,786,146); and COBRA (Combined Bisulfite Restriction Analysis methylation assay described by Xiong & Laird, Nucleic Acids Res. 25:2532-2534, 1997). Such methylation assays are used, for example, to analyze genomic DNA sequence regions that exhibit altered methylation patterns (hypermethylation or hypomethylation) in cancer patients. These methylation-altered DNA sequences are, in turn, useful in indirect therapeutic applications as diagnostic, prognostic and therapeutic markers for human cancer.
Assays for the discovery of novel differentially methylated CpG sequences are less numerous in the art, but include such methods as: restriction landmark genomic scanning (“RLGS”; Eng et al., Nature Genetics 25:101-102, 2000; Costello et al., Nature Genetics 25:132-138, 2000; Zhu et al., Proc. Natl. Acad. Sci. USA 96:8058-8063, 1999); methylated CpG island amplification (“MCA”; Toyota et al., Cancer Res. 59:2307-2312, 1999; WO 00/26401A1); differential methylation hybridization (“DMH”; Yan et al., Clin. Canc. Res. 6:1432-1438, 2000); arbitrarily primed-polymerase chain reaction (“AP-PCR”; Liang et al., Genomics 53:260-268, 1998); and RLGS in combination with virtual genome scans (“VGS”; Rouillard et al., Genome Research 11:1453-1459, 2001) derived from the sequence of the human genome to predict the sequence of RLGS fragments (spots).
Restriction Landmark Genomic Scanning. For example, restriction landmark genomic scanning (“RLGS”) approaches have been employed to identify sequences and regions of differential methylation, and regions so-identified have been cloned and sequenced. RLGS methods take advantage of the fact that specific DNA cleavage by particular restriction enzymes, such as NotI, is methylation sensitive. Moreover, NotI has a CG-rich octanucleotide recognition motif, and cleaves predominantly in CpG-rich “islands.” Thus, digestion of genomic DNA with NotI and end-labeling of the NotI staggered ends, followed by further restriction digestion (e.g., with 5-base and/or 6-base recognition sequence enzymes) in combination with 2-dimensional electrophoresis, has been used to generate resolved patterns of CpG-island-related fragments having at least one labeled NotI end. Such patterns can be used to compare the methylation status among various genomic DNA samples, and if a particular NotI site is methylated in a test genomic DNA sample, relative to that in normal genomic DNA, no corresponding end-labeled fragment(s) will be visible in the RLGS pattern of the test sample (corresponding ‘spot disappearance,’ or absence). Boundary libraries (e.g., of NotI-EcoRV fragments) can then be used to obtain cloned DNA corresponding to such regions.
Significantly, however, such prior art RLGS methods for detection of CpG methylation are limited, inter alia, by: (i) the use of only particular methylation-sensitive restriction enzymes, which effectively limits analyses to CpG sequences within CpG island regions; (ii) dependence (for detection) upon NotI end-labeling (or the equivalent); and (iii) dependence upon the disappearance of (more accurately, the absence of) a test DNA spot (i.e., where a particular NotI site in a test DNA sample is methylated and therefore not cleaved by NotI digestion) relative to a corresponding spot present in the normal (control) DNA 2-dimensional pattern. Moreover, the current boundary libraries have ‘holes,’ because the EcoRV-EcoRV fragments are excluded.
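The negative detection logic of RLGS described above can be sketched in a few lines. This is a simplified model under stated assumptions, not an implementation of any published RLGS software: methylation is given as a set of NotI site start positions, each cleaved NotI site is treated as one labeled spot, and the function names find_noti_sites, labeled_sites and lost_spots are hypothetical.

```python
import re

NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence


def find_noti_sites(seq: str) -> list[int]:
    """Return 0-based start positions of all NotI sites in seq."""
    # A lookahead is used so that overlapping sites are also reported.
    return [m.start() for m in re.finditer(f"(?={NOTI_SITE})", seq.upper())]


def labeled_sites(seq: str, methylated_sites: set[int]) -> set[int]:
    """NotI sites that are cleaved (and so end-labeled) in one sample.

    Assumption: a methylated NotI site is not cleaved and yields no spot.
    """
    return {pos for pos in find_noti_sites(seq) if pos not in methylated_sites}


def lost_spots(seq: str, normal_methylated: set[int],
               test_methylated: set[int]) -> list[int]:
    """Sites labeled in the normal sample but absent from the test sample."""
    normal = labeled_sites(seq, normal_methylated)
    test = labeled_sites(seq, test_methylated)
    return sorted(normal - test)


if __name__ == "__main__":
    genome = "AAAA" + NOTI_SITE + "TTTT" + NOTI_SITE + "CCCC"
    # The second NotI site (position 16) is methylated only in the test sample.
    print(lost_spots(genome, normal_methylated=set(), test_methylated={16}))
```

The sketch makes the limitation visible: only cytosines that lie within a NotI recognition motif can ever be scored, and methylation is inferred from the absence of a signal rather than from a positive label.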
Virtual Genome Scans. Virtual genome scans (VGS) provide methods for use in conjunction with RLGS methods to identify fragments of interest displayed in RLGS scans. Informatics tools are used, in conjunction with known human genome sequence information, to produce virtual scans, for example, with NotI and EcoRV (as first-dimension RLGS restriction enzymes), and, for example, HinfI or DpnII (as second-dimension enzymes). The sizes of the expected NotI-EcoRV and NotI-NotI fragments (if no intervening EcoRV site is present) are computed, along with the second-dimension fragments, based on the HinfI or DpnII site nearest to a particular NotI site (Rouillard et al., Genome Research 11:1453-1459, 2001). Thus, identification of RLGS sequences can be made without the use of boundary libraries, and is therefore not subject to the EcoRV-EcoRV ‘holes’ present in such libraries.
However, the method still depends on determining the differences between two samples using RLGS, and is thus subject to most of the limitations thereof.
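The fragment-size computation underlying a virtual scan, as summarized above, can be sketched as follows. This is a minimal illustration and not the tool of Rouillard et al.: it considers only the fragment downstream of each NotI site on one strand, measures sizes between recognition-site start positions rather than exact cut positions, and uses the hypothetical function names site_starts and virtual_scan.

```python
import re

NOTI = "GCGGCCGC"      # first-dimension, labeled end
ECORV = "GATATC"       # first-dimension, second enzyme
HINFI = "GA[ACGT]TC"   # second-dimension enzyme (GANTC)


def site_starts(seq: str, pattern: str) -> list[int]:
    """0-based start positions of every match of a recognition pattern."""
    return [m.start() for m in re.finditer(f"(?=(?:{pattern}))", seq)]


def virtual_scan(seq: str) -> list[tuple[int, int, int]]:
    """Predict (NotI position, first-dimension size, second-dimension size).

    Simplifying assumptions: downstream direction only, and sizes taken
    between site start positions.
    """
    seq = seq.upper()
    noti = site_starts(seq, NOTI)
    first_dim = sorted(set(noti) | set(site_starts(seq, ECORV)))
    hinfi = site_starts(seq, HINFI)
    spots = []
    for n in noti:
        # Nearest downstream NotI or EcoRV site ends the first-dimension fragment.
        end1 = next((p for p in first_dim if p > n), None)
        if end1 is None:
            continue  # no downstream first-dimension site in this sequence
        # Nearest downstream HinfI site, if it lies inside that fragment.
        end2 = next((p for p in hinfi if p > n), None)
        if end2 is None or end2 > end1:
            end2 = end1
        spots.append((n, end1 - n, end2 - n))
    return spots


if __name__ == "__main__":
    demo = NOTI + "A" * 10 + "GAGTC" + "A" * 20 + ECORV + "A" * 5
    print(virtual_scan(demo))
```

Each returned tuple corresponds to one predicted spot, so a spot observed in an actual RLGS pattern could in principle be matched to a genomic NotI site by its two sizes.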
Methylated CpG Island Amplification. Methylated CpG island amplification (“MCA”) is a PCR-based technique for rapid enrichment of hypermethylated CG-rich regions that requires sequential digestion by a particular methylation-sensitive/methylation-insensitive isoschizomeric enzyme pair (i.e., SmaI and XmaI, respectively), followed by PCR amplification based on primers that specifically hybridize to adapters ligated to the staggered XmaI ends. Additionally, the restriction sites must be closely situated (<1 kb apart). Thus, as in the case of prior art RLGS applications, the method is primarily limited to particular CpG sequences within CpG-rich genomic regions (Toyota et al., Cancer Res. 59:2307-2312, 1999). Moreover, the technique is sensitive to artifacts relating to incomplete digestion with SmaI, the methylation-sensitive restriction enzyme. The technique can be combined, in a more complex multistep method, with subtractive hybridization (RDA; representational difference analysis) to obtain cloned fragments enriched for hypermethylated sequences (Id.).
Methylation-Sensitive Arbitrarily Primed PCR. Likewise, methylation-sensitive arbitrarily primed-polymerase chain reaction (“AP-PCR”) is a PCR-based technique for rapid enrichment of hypermethylated CG-rich regions that involves co-digestion of DNA with a methylation-insensitive enzyme (e.g., RsaI) to generally reduce the size of DNA fragments, plus, in separate reactions, a methylation-sensitive member and a methylation-insensitive member of an isoschizomeric enzyme pair (e.g., RsaI plus HpaII, and RsaI plus MspI, respectively), followed by PCR amplification using one or more specific oligonucleotide primers. In this case, no PCR products are produced if the region between two primer sites contains an unmethylated HpaII (CCGG) sequence. Digestions of the DNA with RsaI only, and with RsaI and MspI, serve as controls for determining whether bands observed in the AP-PCR of RsaI- plus HpaII-digested DNA are actually due to differential methylation of CCGG sequences within the region of amplification (Gonzalgo et al., Cancer Research 57:594-599).
Thus, methylation-sensitive AP-PCR methods are limited commensurate with primer choice and, as for RLGS and MCA described above, are primarily biased toward CpG island regions, especially when extensively CG-rich primer sequences are employed (Liang et al., Genomics 53:260-268, 1998). Generally, methylation-sensitive AP-PCR is subject to many of the same artifacts that limit the effectiveness of MCA methods, such as incomplete digestion by restriction enzymes, and distance between primer sites.
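The way the three digests are read against one another in methylation-sensitive AP-PCR can be sketched as a small decision procedure. This is an idealized model under stated assumptions (complete digestion, a region bounded by two primer sites, methylation given as a set of CCGG start positions); the function names ccgg_sites, predict_bands and interpret are hypothetical.

```python
import re


def ccgg_sites(region: str) -> list[int]:
    """0-based start positions of HpaII/MspI (CCGG) sites in the region."""
    return [m.start() for m in re.finditer("(?=CCGG)", region.upper())]


def predict_bands(region: str, methylated_sites: set[int]) -> dict[str, bool]:
    """Predict whether a PCR band appears after each digest.

    Assumptions: digestion is complete; MspI cuts every CCGG site;
    HpaII cuts only unmethylated CCGG sites; RsaI cuts GTAC.
    """
    intact_after_rsai = "GTAC" not in region.upper()
    sites = ccgg_sites(region)
    return {
        "RsaI": intact_after_rsai,
        "RsaI+MspI": intact_after_rsai and not sites,
        "RsaI+HpaII": intact_after_rsai
        and all(s in methylated_sites for s in sites),
    }


def interpret(bands: dict[str, bool]) -> str:
    """Read the HpaII lane against the two control lanes."""
    if not bands["RsaI"]:
        return "no product: region is cut by RsaI"
    if bands["RsaI+MspI"]:
        return "uninformative: no CCGG site between the primer sites"
    if bands["RsaI+HpaII"]:
        return "all CCGG sites methylated"
    return "at least one CCGG site unmethylated"


if __name__ == "__main__":
    region = "ATATCCGGATAT"
    print(interpret(predict_bands(region, methylated_sites={4})))
    print(interpret(predict_bands(region, methylated_sites=set())))
```

The control lanes matter because a band in the HpaII lane is only evidence of methylation when the MspI lane shows that a CCGG site is actually present within the amplified region.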
Differential Methylation Hybridization. Differential methylation hybridization (“DMH”) is a microarray-based method involving differential probing of arrayed CG-rich tags (from a CpG island genomic library) with amplicons from reference, or, e.g., tumor DNA samples. The differences in tumor and reference signal intensities on the tested CpG island arrays reflect methylation alterations of corresponding sequences in the tumor DNA (Yan et al., Clin. Canc. Res. 6:1432-1438, 2000).
To produce amplicons, the DNA is digested to produce small (<200 bp) DNA fragments while preserving CpG islands (e.g., by digestion with MseI, recognizing TTAA). Linkers are ligated to the fragment ends, and the fragments are digested with a methylation-sensitive enzyme, e.g., BstUI (77% of known CpG islands contain BstUI sites), prior to filling in the protruding linker ends and PCR amplification using linker primers. Fragments cleaved by the methylation-sensitive enzyme are rendered non-amplifiable by the linker primers, so that the amplified fragment pool is enriched for methylated amplicons.
However, the method is limited to CpG-rich islands and, at least currently, is further limited by the fact that only about 2% of the total genomic CpG island regions are represented in the available arrayed panels (Id.).
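The amplicon preparation step of DMH described above can be sketched in silico. This is a simplified model under stated assumptions: digestion is complete, every MseI fragment (including the two terminal pieces of the input sequence) is treated as linker-ligated, and methylation is given as a set of BstUI (CGCG) site start positions. The function names mse_fragments, amplified_fragments and hypermethylated_in_tumor are hypothetical.

```python
import re


def mse_fragments(seq: str) -> list[tuple[int, int]]:
    """Half-open (start, end) intervals produced by MseI (T^TAA) digestion."""
    cuts = [m.start() + 1 for m in re.finditer("(?=TTAA)", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def amplified_fragments(seq: str, methylated_bstui: set[int]) -> list[tuple[int, int]]:
    """MseI fragments that survive BstUI digestion and can be amplified.

    A fragment is lost if it contains any unmethylated CGCG site.
    Fragments without a CGCG site amplify regardless of methylation.
    """
    seq = seq.upper()
    bstui = [m.start() for m in re.finditer("(?=CGCG)", seq)]
    kept = []
    for a, b in mse_fragments(seq):
        inside = [s for s in bstui if a <= s and s + 4 <= b]
        if all(s in methylated_bstui for s in inside):
            kept.append((a, b))
    return kept


def hypermethylated_in_tumor(seq: str, reference_methylated: set[int],
                             tumor_methylated: set[int]) -> list[tuple[int, int]]:
    """Fragments amplified from tumor DNA but not from reference DNA."""
    reference = set(amplified_fragments(seq, reference_methylated))
    return [f for f in amplified_fragments(seq, tumor_methylated)
            if f not in reference]


if __name__ == "__main__":
    locus = "AAAACGCGAAAA" + "TTAA" + "GGGGCGCGGGGG"
    # The CGCG site at position 20 is methylated only in the tumor sample.
    print(hypermethylated_in_tumor(locus, set(), {20}))
```

As in the method itself, only fragments that contain a site for the chosen methylation-sensitive enzyme are informative; a fragment without such a site amplifies in both samples and produces no differential signal.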
Whereas RLGS and other prior art assays to identify differentially methylated CpG sequences have great potential, there is a need in the art for additional methods not only to validate the number of genes with hypermethylated promoters in neoplasia and other diseases, but also to determine the number that are relevant to tumorigenesis or other aberrant cell functions. For example, many promoters, including those critical to cancer biology and inactivated through hypermethylation, do not contain CpG islands.
Therefore, there is a need in the art for novel methods to identify all novel differentially methylated CpG dinucleotide sequences, where the methods are neither limited to methylation analyses within CpG-rich genomic regions (as is primarily the case for RLGS, AP-PCR, MCA, and DMH applications), nor limited to methylation analyses of CpG dinucleotide sequences within particular restriction enzyme recognition motifs. Additionally, there is a need in the art for methods which provide for positive detection of methylated genomic DNA fragments based on specific labeling of methylated CpG sequences, as opposed to methods based on differential digestion by a methylation-sensitive restriction enzyme followed by indirect or negative detection, based on labeling of restriction enzyme-generated ends and identification by virtue of the absence of labeling (as in RLGS methods). Additionally, there is a need in the art to identify those CpG dinucleotide sequences that are potentially methylatable, either at the level of isolated genomic DNA, or at the cellular level in the context of particular cellular physiologies.