Methylation of cytosine to form 5-methylcytosine occurs commonly in eukaryotic genomes. In mammalian cells, methylation of cytosine in CpG dinucleotide sequences (5-Me-CpG, 5mCpG, meCpG, or mCpG) plays a role in regulation of chromatin stability, gene regulation, parental imprinting and X-chromosome inactivation. Cytosine methylation is recognized as one of the most important epigenetic modifications in cells that affects a surprising number of physiological states. In particular it has been recognized that hypermethylation of cytosine in CpG-rich sequences located in promoter and/or other regulatory regions of genes (CpG islands) has a tremendous effect on gene expression. For example, hypermethylation of such regulatory regions has been correlated with cancer.
Esteller, et al. (Cancer Res. 2001:61, 3225-3229) have shown that specific hypermethylation profiles among several genes can be correlated to specific types of cancer. For example, hematological malignancies (lymphomas and leukemias) show hypermethylation of p73 and p15INK4b, which hypermethylation is absent from many solid tumor tissues. Further, hypermethylation of the mismatch repair gene hMLH1 is restricted to tumor types characteristic of hereditary nonpolyposis colorectal cancer syndrome: colorectal, endometrial and gastric tumors with microsatellite instability.
Herman and Baylin (N. Engl. J. Med. 2003:349, 2042-2054) have reviewed gene silencing in cancer in relation to promoter hypermethylation. One of the specific points made in the review is the possibility that hypermethylation of specific gene regions can be used to identify very early stages in developing cancers. It has long been recognized for a variety of cancers that early detection provides significant long-term survival benefits. In modern medicine, considerable medical and basic scientific expertise has been and continues to be applied to the development of techniques, procedures and diagnostic assays that can be performed on specimens, particularly those that can be obtained non-invasively or in a minimally invasive manner, to identify at-risk individuals and to begin prevention/treatment evaluation and action. For example, sputum specimens can be used to identify aberrant methylation of specific genes in various stages of lung cancer (Belinsky et al. Proc. Natl. Acad. Sci. USA (1998) 95:11891-11896; Palmisano et al. Cancer Res. (2000) 60:5954-5958). Similarly, prior to symptom onset, urine specimens could be used to provide cells indicating hypermethylation of genes that have been identified in association with bladder cancer. Likewise, examination of hypermethylation patterns in tissues obtained through minimally invasive procedures such as vaginal and cervical scrapings, gastric lavage specimens, and colon tissue could lead to early detection assays for vaginal, cervical and/or uterine cancer, esophageal or stomach cancers and colorectal cancers.
An ever-increasing variety and sophistication of methods for detection of aberrant mCpG patterns are being applied to analysis of cytosine methylation. Several methods for analyzing DNA methylation patterns on a genome-wide scale have been developed, yet none has achieved widespread acceptance. Rather than analyzing genome-wide methylation patterns, many studies focus on specific gene regulatory regions, utilizing arrays carrying specific sequences which have previously been recognized as being related to the specific physiological condition being studied. Methods in general use combine the variety of approaches that have been developed over the years.
An early method relied on the use of methylation-sensitive and -insensitive restriction endonucleases. After DNA is cut with a pair of isoschizomers, Southern blotting, amplification analysis, or a combination of the two is used to assess the methylation status of particular genetic loci. Other methods make use of modification or elimination of cytosine versus methyl-cytosine bases in DNA to assess methylation patterns (e.g., Tanaka et al., J. Am. Chem. Soc. 2007:129, 5612-5620; Frommer, et al., Proc. Natl. Acad. Sci. USA 1992:89, 1827-1831; Clark, et al. Nuc. Acids Res., 1994:22, 2990-2997). Some methods, such as “combined bisulfite restriction analysis” (COBRA), combine elimination of cytosine with restriction enzyme analysis (Xiong and Laird, Nuc. Acids Res. 1997:25, 2532-2534) to assess methylation status. Other methods combine cytosine modification/elimination with amplification analysis, making use of specific amplification primers to match the variety of sequences that arise from bisulfite modification of cytosine residues, versus non-modification of methyl-cytosine (e.g., methylation-specific PCR (MSP), Herman et al., Proc. Natl. Acad. Sci. USA 1996:93, 9821-9826).
In addition, various methyl-cytosine binding ligands have been used to isolate mCpG-containing sequences. Such ligands include an antibody that requires single stranded DNA in order to bind mCpG sequences (Keshet et al. Nat. Genet. 2006:38, 149-153; Weber et al. Nat. Genet. 2005:37, 853-862); a methylated DNA binding protein domain (MBD) of MeCP2 (Cross et al. Nat. Genet. 1994:6, 236-244); a bivalent antibody-like construct of methyl-CpG binding domain protein 2 (MBD2) and human-Fc (Gebhard, et al. Cancer Res. 2006:66, 6118-6128 and Gebhard, et al. Nuc. Acids Res. 2006:34, e82) and a complex of MBD2/MBD3L1 (Rauch, et al. Cancer Res. 2006:66, 7939-7947).
Kits making use of methyl-CpG binding proteins are now available for purchase from various sources (e.g., “MethylCollector”™ from Active Motif, Inc., Carlsbad, Calif.), as are kits for preparing bisulfite-treated DNA, a procedure which converts cytosine residues to uracil (e.g., EpiTect Bisulfite Kit from Epigenomics AG, Berlin, Germany), kits for Differential Methylation Hybridization (DMH-Epigenomics AG, Berlin), kits for Promoter Methylation Array and Methylation Promoter Polymerase Chain Reaction (PCR) (Panomics, Inc., Fremont, Calif.) and kits for “MethylLight” TaqMan® assays (EpiTect Quantitative MethylLight, Epigenomics AG, Berlin).
In the approaches utilizing proteins to enrich the fraction of sequences containing methyl-CpG sequences, the affinity and specificity of the binding proteins are of considerable importance. The affinity of the reagents is particularly important if one wishes to isolate such sequences from specimens in which the tumor or tumor progenitor cells represent a very small fraction of the total specimen.
Methods to enrich the methyl-CpG sequences of a sample could simplify and enhance the accuracy of assays that might benefit from the combination of modification/elimination of cytosine residues, restriction analysis and methyl binding proteins.
For example, some propose to analyze specimens for the presence and prevalence of particular CpG sequences that are retained after bisulfite treatment of DNA, for instance, analyzing all remaining CGCG sequences. In this example, any retained CGCG sequence would indicate that the original DNA was doubly methylated prior to treatment (-MeCGMeCG-). The bisulfite reacted and amplified product (-CGCG-) could be analyzed through the use of cleavage with BstUI and the results related to specific sequences associated with particular cancers.
In bisulfite treated, amplified DNA, any remaining cytosine residues were originally methyl-cytosine residues. In the example of analyzing for retained CGCG sequences, only a fraction of the remaining cytosines in the entire bisulfite treated sample were originally in the doubly methylated CGCG sequence. Thus the cytosine residues remaining in CGCG may represent such a very small fraction of the remaining cytosine residues in the specimen that simple analytical methods involving binding with CpG binding ligands could prove ineffective. Thus, examination of specific methyl-CpG sequences would benefit from a procedure capable of enriching for the particular sequence in its methylated state. Embodiments of the present invention provide for enrichment of methyl-CpG sequences, particularly from DNA samples that were first treated to remove all non-methylated cytosine residues.
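The dilution problem described above can be illustrated with a minimal sketch. It assumes a single strand of bisulfite treated, amplified DNA written as a plain string, in which every remaining C was originally a methyl-cytosine. The function name `cgcg_fraction` and the toy sequence are hypothetical and serve only as illustration; they are not data from any specimen.

```python
import re


def cgcg_fraction(converted: str) -> float:
    """Fraction of retained cytosines that lie inside a retained CGCG site.

    `converted` is assumed to be one strand of bisulfite treated,
    amplified DNA, so every C in it was originally methyl-cytosine.
    """
    total_c = converted.count("C")
    if total_c == 0:
        return 0.0
    in_site = set()
    # A zero-width lookahead finds overlapping sites, e.g. in CGCGCG.
    for match in re.finditer(r"(?=CGCG)", converted):
        # The cytosines of a CGCG site sit at offsets 0 and 2.
        in_site.update((match.start(), match.start() + 2))
    return len(in_site) / total_c


# Hypothetical toy sequence: five retained cytosines, one CGCG site.
toy = "TTCGTTACGCGTTTCGATCGTT"
print(f"{cgcg_fraction(toy):.2f} of retained cytosines are in CGCG")
```

In this toy case only two of the five retained cytosines fall within CGCG. In a real specimen the proportion could be far smaller, which is the situation in which a general CpG binding ligand would capture mostly sequences other than the one of interest.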
Another issue important to developing early diagnostic assays related to methylation status is illustrated in part in the above discussion. If one wished to analyze methylation status with respect to particular arrangements of CpG sequences (i.e., the “context” of the mCpG sequence), the above situation illustrates the difficulties. In the CGCG tetranucleotide there are four potential mCpG methylation states: unmethylated (CpGpCpG), singly methylated (MeCpGpCpG or CpGpMeCpG), and doubly methylated (MeCpGpMeCpG). After bisulfite treatment, the four states would yield UGUG, CGUG, UGCG and CGCG, respectively. Sequence analysis would not be capable of distinguishing whether the first three of the four product sequences arose from CGCG or TGTG, MeCGCG or MeCGTG, and CGMeCG or TGMeCG, respectively. The only clear result would be the one resulting in CGCG, which could only result from MeCGMeCG.
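The four conversion outcomes for the CGCG tetranucleotide can be reproduced with a short sketch. The single-letter code `M` for methyl-cytosine and the function name `bisulfite_convert` are assumptions made for illustration only. The model treats one strand and idealizes the chemistry as complete conversion of every unmethylated cytosine.

```python
def bisulfite_convert(seq: str, amplified: bool = False) -> str:
    """Idealized bisulfite conversion of one strand.

    'M' is a hypothetical one-letter code for methyl-cytosine.
    Unmethylated C becomes U (read as T after amplification);
    methyl-cytosine is protected and reads as C.
    """
    converted_base = "T" if amplified else "U"
    table = {"C": converted_base, "M": "C"}
    return "".join(table.get(base, base) for base in seq)


states = {
    "unmethylated": "CGCG",
    "first CpG methylated": "MGCG",
    "second CpG methylated": "CGMG",
    "doubly methylated": "MGMG",
}
products = {name: bisulfite_convert(seq) for name, seq in states.items()}
for name, product in products.items():
    print(f"{name:>22}: {product}")

# After amplification, an unmethylated CGCG and a genomic TGTG read the same.
same = bisulfite_convert("CGCG", amplified=True) == bisulfite_convert("TGTG", amplified=True)
print("CGCG and TGTG indistinguishable after amplification:", same)
```

Under these assumptions, only the doubly methylated state leaves a product that still reads CGCG. Each of the other three products can also be reached from a sequence that carried T at the converted position.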
This is also the case for sequences having a single mCpG methylation site for which context is lost through bisulfite treatment and amplification. For example, CCGG has a single mCpG methylation site (CMeCGG); after conversion of cytosines (e.g., by bisulfite treatment) and amplification, the sequence is no longer CCGG but TCGG. Thus, after bisulfite treatment and analysis, without methods of embodiments of the present invention, it would be difficult and likely impossible to know whether the sequence was originally TMeCGG or CMeCGG. Other examples that would benefit from context determinative methods of the invention include any CG-rich sequence that includes the mCpG dinucleotide.
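The loss of context can be made concrete by enumerating every original sequence consistent with an observed product. The sketch below is illustrative only. It assumes a single strand, complete conversion, and a hypothetical one-letter code `M` for methyl-cytosine; the function name `possible_originals` is likewise an assumption.

```python
from itertools import product
from typing import List


def possible_originals(observed: str) -> List[str]:
    """List original sequences that could yield `observed` after
    idealized bisulfite conversion and amplification of one strand.

    An observed T may have been T or an unmethylated C.
    An observed C must have been methyl-cytosine, written here as 'M'.
    """
    options = {"T": ("T", "C"), "C": ("M",)}
    choices = [options.get(base, (base,)) for base in observed]
    return ["".join(combo) for combo in product(*choices)]


for candidate in possible_originals("TCGG"):
    print(candidate)
```

For the observed product TCGG the enumeration gives two candidates, TMGG and CMGG, corresponding to TMeCGG and CMeCGG. The number of candidates doubles with each additional T in the observed sequence, so the ambiguity grows quickly for longer sequences.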
An additional improvement in current methods would be afforded by providing additional mCpG-binding ligands. One aspect of the present invention provides an isolated, purified recombinant McrA protein (rMcrA) for specific binding of mCpG sequences found in the context of both CMeCGG and MeCMeCGG. Clones and methods for expression, isolation and use of the rMcrA protein are disclosed herein.