DNA methylation is one of the epigenetic mechanisms of gene regulation. Variations in DNA methylation status at certain loci control gene expression by silencing or activating specific genes. The presence of a methyl group on the 5-carbon (C5) of a cytosine in a CG dinucleotide (CpG) is believed to prevent the transcription machinery from binding to the promoter of a gene. Some loci in the genome, called “tissue-specific differentially methylated regions” (tDMRs), can therefore be used for cell identification because their DNA methylation status differs across cell types. The most commonly used methods for determining the pattern of DNA methylation at a locus rely on bisulfite modification of genomic DNA. Bisulfite chemically converts unmethylated cytosines to uracils but does not react with methylated cytosines. During a polymerase chain reaction (PCR), the uracils are copied as thymines, and the amplicons can then be sequenced to determine whether a cytosine or a thymine is present at each CpG.
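The read-out logic of bisulfite conversion can be illustrated with a minimal sketch. It assumes complete conversion and error-free PCR and sequencing, and it considers a single strand only. The function names, the toy reference sequence, and the chosen methylated position are hypothetical and serve only as an illustration.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the sequence read after bisulfite treatment and PCR.

    Assumes idealized chemistry: every unmethylated C is converted
    (C -> U, copied as T during PCR); every methylated C stays C.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")   # unmethylated cytosine reads as thymine
        else:
            out.append(base)  # methylated C and all other bases unchanged
    return "".join(out)


def call_cpg_methylation(reference, converted):
    """Map each CpG position in the reference to True (read shows C,
    i.e. methylated) or False (read shows T, i.e. unmethylated)."""
    reference = reference.upper()
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            calls[i] = converted[i] == "C"
    return calls


# Hypothetical toy locus with CpGs at positions 1, 5 and 9 (0-based);
# only the CpG at position 1 is assumed to be methylated.
reference = "ACGTTCGACCGA"
converted = bisulfite_convert(reference, methylated_positions={1})
calls = call_cpg_methylation(reference, converted)

print(converted)  # ACGTTTGATTGA
print(calls)      # {1: True, 5: False, 9: False}
```

Note that the non-CpG cytosine at position 8 is also read as thymine, which is why methylation is called only at positions that are CpGs in the reference.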
Generally, DNA methylation occurs through methylation of cytosine residues at CpG dinucleotide sites by DNA methyltransferases (DNMTs). Methylation can inhibit gene transcription by recruiting chromatin remodeling factors that influence the accessibility of DNA during transcription.
Environmental factors play an important role in modifying the DNA methylation status at certain sites. For example, differences in DNA methylation status in monozygotic twins are greater for the pairs that spend less of their lifetime together or exhibit different lifestyles. Environmental exposures such as diet, stress, and smoking can alter DNA methylation at various stages of human development.
Tobacco smoking is a powerful environmental factor that changes DNA methylation. Changes in DNA methylation can mediate the effects of tobacco smoking by altering gene expression at certain genetic loci. Tobacco smoking can alter DNA methylation through several mechanisms. First, smoking can modulate methylation patterns through carcinogen-induced DNA damage and repair. Various carcinogens in tobacco, particularly in cigarettes, such as arsenic, nitrosamines, polycyclic aromatic hydrocarbons, and formaldehyde, can cause double-stranded DNA breaks. Such breaks require DNA repair, during which DNA methyltransferase 1 (DNMT1) methylates the CpGs adjacent to the repaired sites. Second, smoking can alter DNA methylation status through an effect of nicotine on gene expression. Nicotine can alter DNMT1 activity and affect protein expression. Third, smoking can modify DNA methylation by affecting the expression and activity of DNA-binding factors such as Sp1. Smoking increases the expression of Sp1, which binds to GC-rich motifs in gene promoters and thereby prevents de novo methylation of CpGs at these motifs. Finally, hypoxia is another mechanism by which tobacco smoking may alter DNA methylation. Tobacco smoke contains carbon monoxide, which binds to hemoglobin and reduces oxygen levels in the tissue. In turn, hypoxia may upregulate methionine adenosyltransferase 2A, which is responsible for the synthesis of S-adenosylmethionine, a key methyl donor for DNA methylation. Thus, variation in DNA methylation is one mechanism that can potentially mediate the effects of tobacco smoking.
Techniques to distinguish current smokers from never smokers based on DNA methylation status are not well established. Most of the techniques developed so far for this purpose are based on chip arrays that only provide information on a single CpG site. In addition, array studies require large amounts of DNA and laborious bioinformatic analysis, which may make them unsuitable for forensic or other applications with limited samples. By contrast, pyrosequencing permits identification and quantification of the methylation status of clusters of CpG sites associated with a genomic locus, and it allows highly accurate determination of methylation status at each CpG site within that locus. Further, this technique requires only a minimal amount of starting DNA, which also permits downstream short tandem repeat (STR) testing.
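Per-site quantification of this kind can be sketched as follows. The sketch assumes that, after bisulfite conversion, the methylated fraction at a CpG is estimated as the cytosine signal divided by the sum of the cytosine and thymine signals at that position. The function name and the signal values are hypothetical and are not measurements from any study.

```python
from statistics import mean


def percent_methylation(c_signal: float, t_signal: float) -> float:
    """Estimate percent methylation at one CpG as C / (C + T) * 100.

    c_signal: signal attributed to cytosine (methylated template)
    t_signal: signal attributed to thymine (unmethylated template)
    """
    total = c_signal + t_signal
    if total <= 0:
        raise ValueError("no usable signal at this CpG")
    return 100.0 * c_signal / total


# Hypothetical (C, T) signal pairs for three neighbouring CpGs in one locus.
signals = [(30.0, 70.0), (45.0, 55.0), (80.0, 20.0)]

per_site = [percent_methylation(c, t) for c, t in signals]
locus_mean = mean(per_site)  # simple summary across the CpG cluster

print(per_site)
print(round(locus_mean, 1))
```

Reporting both the per-site values and a locus-level summary reflects the point made above: a cluster of CpGs yields more information than a single site.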
A variety of chip array-based platforms (e.g., Illumina 27K and 450K) have been developed to permit a much broader investigation and identification of differentially methylated loci across the genome. Several genetic loci have emerged as robust indicators of tobacco smoking from investigations using these array-based platforms. The first locus to be consistently identified was the coagulation factor II (thrombin) receptor-like 3 (F2RL3), reported by Breitling et al., who used a 27K array to study the effect of smoking on peripheral mononuclear cell pellets and identified several loci associated with smoking, including F2RL3, GPR15, and ORAI2. The second important locus to emerge was the aryl hydrocarbon receptor repressor (AHRR), uncovered by Monick et al., who examined the effect of smoking in lymphoblast and lung macrophage DNA using the Illumina HumanMethylation 450K BeadChip and showed that tobacco smoking can cause significant changes in DNA methylation patterns at various genomic loci, especially at AHRR.
A large study carried out by Zeilinger et al. further confirmed the changes in DNA methylation patterns at the above-noted loci and extended the list of genes to include HIVEP3 and CACNA1D. Other genetic loci have also been found to contain smoking-specific CpG sites, including 2q37, 6p21.33, growth factor independent 1 transcription repressor (GFI1), myosin IG (MYO1G), CPOX, GPR15, CYP1A1, and many others.
The DNA methylation signatures of candidate sites have been shown to serve as useful biomarkers for various traits. Interest in such applications has resulted in several genome-wide association studies using large-scale epigenetic arrays. However, because DNA methylation analysis is mainly performed with arrays, which require laborious bioinformatic analysis, applying it in clinical and forensic settings remains difficult owing to the complexity of the instrumentation and the need for relatively large sample quantities.