Methylation of cytosines located 5′ adjacent to guanosine is known to have a repressive effect on the expression of many eukaryotic genes (1-6).
Aberrant methylation of normally unmethylated CpG islands has been documented as a relatively frequent event in experimentally immortalized and transformed cells, and it has been clearly associated with transcriptional inactivation of defined tumor suppressor genes in human cancers (7, 8). Hundreds of CpG islands are now known to be hypermethylated in tumor cells (9). Therefore, mapping of methylation patterns in CpG islands has become important for understanding both normal and pathologic gene expression events.
The most direct mechanism by which DNA methylation can interfere with transcription is to prevent the binding of basal transcriptional machinery or ubiquitous transcription factors that require contact with cytosine in the major groove of the double helix. Most mammalian transcription factors have GC-rich binding sites, and many have CpGs in their DNA recognition elements. Binding by several of these factors is impeded or abolished by methylation of CpG.
The highest density of nonmethylated CpGs in the vertebrate genome is found in CpG islands, which usually contain promoter or other regulatory DNA that is required for active transcription of a gene. CpG island chromatin is enriched in hyperacetylated histones and deficient in linker histones. These are important features of transcriptionally competent chromatin templates. In contrast, chromatin assembled on artificially methylated DNA becomes associated with hypoacetylated histones, refractory to nuclease or restriction endonuclease digestion and transcriptionally silent. Many tumor-suppressor and other cancer-related genes have been found to be hypermethylated in human cancer cells (48).
The classes of genes that are silenced by DNA methylation include tumor-suppressor genes; genes that suppress tumor invasion and metastasis; DNA repair genes; genes for hormone receptors; and genes that inhibit angiogenesis. Gene silencing by hypermethylation has been recognized as an important mechanism of carcinogenesis, and one that holds great promise as a target for cancer prevention and therapy. A first step is the identification of such genes or regulatory elements. The first described alteration in the retinoid pathway was the leukemogenic role of the PML-RAR fusion protein. Evidence has since been obtained that supports the role of RARβ2 as a tumor-suppressor gene, including the induction of RARβ2 associated with the chemopreventive effects of retinoids, the loss of RARβ2 expression in human neoplasms, frequent chromosomal losses at 3p21-3p24, where RARβ2 is located, and the methylation-mediated silencing of RARβ2. The silencing of the cellular retinoid-binding protein-1 gene (CRBP1) was reported as a common alteration in human cancer (48).
The cytochromes P450 are important phase I enzymes that bioactivate carcinogens, and they have been hypothesized to be responsible, in part, for inter-individual differences in susceptibility to chemically induced disease (10-15). CYP1B1 is among the most highly expressed P450 enzymes in human lung and human breast, and it bioactivates both polyaromatic hydrocarbons and estradiol to highly mutagenic species.
Glutathione-S-transferases (GSTs) are phase II deactivating enzymes critically involved in protecting DNA from electrophilic metabolites of carcinogens, from reactive oxygen, nitrogen, and lipid species, and from chemotherapeutic agents. GSTP1 is the most highly expressed GST in the human lung and upper airway (16, 17). Observational studies of normal tissue expression patterns for both CYP1B1 and GSTP1 gene products suggest inter-individual variation over several orders of magnitude that is not explained by measured environmental exposures (16-18). Regulatory-region features, including promoter genetic polymorphisms, transcription factor levels, and epigenetic features, are hypothesized to vary across individuals. To our knowledge, no detailed survey of variation in normal tissue epigenetic features has been performed across kilobase-level expanses of promoter DNA sequence to explain this inter-individual and inter-tissue variation.
Several methods have been developed to determine the methylation status of cytosines in DNA (19). These include digestion with methylation-sensitive restriction enzymes, as in restriction landmark genomic scanning (20); oligonucleotide arrays (21); pyrosequencing (22); MS-based primer extension methods (23); bisulfite genomic DNA sequencing (BGS); and methylation-specific PCR (MSP). MSP is now an established technology for monitoring abnormal gene methylation in selected gene sequences (24). MSP is a discontinuous method for assaying DNA sequence; it generally samples oligomer annealing sites of approximately 20 bases in and around known CpG methylation sites. The technique relies on bisulfite treatment of genomic DNA, which chemically converts unmethylated cytosines to uracils, and on the replacement of uracil with thymidine in the subsequent PCR. Careful design of MSP primers yields, in separate uniplex reactions, either a match or a mismatch at the CpG site in question, and therefore either a successful or an unsuccessful PCR, with a categorical readout: positive or negative. Both qualitative and quantitative MSP (25-29) require prior genomic methylation screening to direct primer design to the appropriate target sequence for analysis. MALDI-TOF mass spectrometry has recently been reported as an alternative genome-wide methylation mapping approach (30), but it may be resource intensive from a procedural, instrumentation, and informatics perspective.
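The bisulfite conversion and the match/mismatch logic of MSP described above can be illustrated with a minimal in-silico sketch. Everything here is a simplifying assumption made for illustration: the function names, the toy sequence, and the two primers are hypothetical, only the sense strand is modeled, and primer annealing is reduced to an exact substring match, which ignores mismatch tolerance, reverse primers, and reaction conditions.

```python
def bisulfite_convert(seq, methylated=frozenset()):
    """Sense strand as read after bisulfite treatment and PCR.

    Unmethylated C -> U (bisulfite) -> T (PCR); methylated C stays C.
    `methylated` holds 0-based indices of methylated cytosines
    (a hypothetical input format chosen for this sketch).
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def cpg_positions(seq):
    """0-based indices of the C in every CpG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def msp_call(template, primer_m, primer_u):
    """Categorical MSP readout: does each primer find an exact match?

    Exact substring matching stands in for primer annealing (assumption).
    """
    return {"M": primer_m in template, "U": primer_u in template}


# Toy genomic sequence (hypothetical) with two CpG sites.
genomic = "ATCGTTCGAC"
cpgs = cpg_positions(genomic)

# Template with every CpG methylated versus no methylation at all.
template_meth = bisulfite_convert(genomic, methylated=frozenset(cpgs))
template_unmeth = bisulfite_convert(genomic)

# Hypothetical primers spanning both CpG sites.
primer_m = "TCGTTCG"  # matches only if the CpG cytosines stayed C
primer_u = "TTGTTTG"  # matches only if the CpG cytosines became T

print(msp_call(template_meth, primer_m, primer_u))
print(msp_call(template_unmeth, primer_m, primer_u))
```

In this sketch the methylated template is amplified only by the "M" primer and the unmethylated template only by the "U" primer, which mirrors the positive or negative readout of the two separate uniplex reactions.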
BGS offers a continuous readout of the entire, detailed, base-by-base methylation map of a genomic DNA sequence (31, 32). The technique also relies on initial bisulfite modification of DNA and, as a final step, direct cycle sequencing of the resulting PCR-amplified sequence. PCR primers are designed external to potential methylation sites. However, because bisulfite converts unmethylated C to U in the template, the PCR product contains few C nucleotides on the sense strand and few G nucleotides on the antisense strand. The GC content is therefore skewed low, and direct cycle sequencing results in artefactual background signal attributable to excess unused dCTP and dGTP in the sequencing reaction.
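A short sketch can make both points concrete: how a bisulfite read is interpreted base by base at each CpG, and how conversion lowers GC content. The function names and the toy sequences are hypothetical, the read is assumed to be already aligned to the sense-strand reference without gaps, and sequencing error is ignored.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def call_cpg_methylation(reference, read):
    """Call each CpG cytosine from a gap-free, sense-strand bisulfite read.

    A C retained in the read means the cytosine was methylated; a T means
    it was unmethylated and converted. Any other base is reported as
    'ambiguous'. Keys are 0-based positions of the CpG cytosine.
    """
    reference, read = reference.upper(), read.upper()
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            if read[i] == "C":
                calls[i] = "methylated"
            elif read[i] == "T":
                calls[i] = "unmethylated"
            else:
                calls[i] = "ambiguous"
    return calls


# Toy example (hypothetical sequences): the CpG at position 2 is methylated,
# the CpG at position 9 and all non-CpG cytosines are converted to T.
reference = "GACGTCCATCGA"
read = "GACGTTTATTGA"

print(call_cpg_methylation(reference, read))
print(gc_fraction(reference), gc_fraction(read))
```

The drop in GC fraction between the reference and the read is the skew described above; the per-position calls correspond to the continuous, base-by-base map that BGS provides.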
Conventional bisulfite sequencing commonly requires the cloning of PCR product for two reasons: First, the incorporation into the plasmid vector allows skewed GC content to be compensated for by the external plasmid sequence. Second, this approach provides precise methylation patterns of individual DNA molecules, overcoming tissue heterogeneity issues affecting methylation patterns at individual CpG sites. However, this requirement makes conventional BGS time consuming and labor intensive, and it precludes large-scale surveillance studies across multiple regions, genes, tissues, and donors.