DNA methylation is an important epigenetic phenomenon that plays a critical role in regulating natural cellular function, embryonic development, disease initiation and tumorigenesis. DNA methylation, especially the level of methylation across a gene promoter region, directly affects transcriptional activity and regulates gene expression, making it a decisive player in cellular biology and behavior. Currently, DNA methylation is considered one of the most important research subjects in epigenetics and life science.
A number of methods are known for measuring DNA methylation. They can be classified into the following three categories based on their principles:
1. Methods Based on Methylation-Sensitive Restriction Endonucleases
Methylation-sensitive restriction endonucleases (MSREs) are endonucleases whose cleavage activity is sensitive to DNA methylation. Cleavage by these endonucleases is blocked whenever there is a methylated base in the restriction site, and the resulting cleavage pattern is then detected by Southern blot or PCR. HpaII and MspI are the most commonly used endonuclease pair in methylation detection: both endonucleases recognize the same sequence, but HpaII is sensitive to methylation while MspI is not. The advantage of such methods is simple manipulation; the disadvantage is that detection is confined to the restriction sites, which largely limits the methylated regions available to research.
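The HpaII/MspI comparison described above can be illustrated with a minimal Python sketch. It assumes the shared recognition sequence CCGG and a simplified model in which HpaII is blocked only when the internal cytosine of a site is methylated, while MspI always cuts. The function name `digest_calls` and the input of 0-based methylated positions are hypothetical, chosen only for illustration.

```python
import re

# Assumption: recognition sequence shared by HpaII and MspI.
SITE = "CCGG"

def digest_calls(seq, methylated_positions):
    """For each CCGG site, report whether each enzyme would cut.

    methylated_positions: set of 0-based indices of methylated cytosines
    (a hypothetical input format used only in this sketch).
    """
    calls = []
    for match in re.finditer(SITE, seq.upper()):
        start = match.start()
        internal_c = start + 1  # second C of CCGG, i.e. the C of the CpG
        blocked = internal_c in methylated_positions
        calls.append({
            "site_start": start,
            "hpaii_cuts": not blocked,  # HpaII is blocked by methylation
            "mspi_cuts": True,          # simplified: MspI cuts regardless
            "methylated": blocked,      # inferred from the HpaII/MspI difference
        })
    return calls

# Two CCGG sites; only the second has a methylated internal cytosine.
example = digest_calls("AACCGGTTCCGGAA", {9})
```

A site cut by MspI but not by HpaII is read as methylated. Cytosines outside CCGG sites are invisible to this assay, which is the coverage limitation noted above.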
2. Methods Based on the Antibody Against DNA Methylation
These methods use an antibody against methyl-cytosine or a DNA methylation-binding protein. The principle and manipulation are similar to ChIP (chromatin immunoprecipitation). The enriched and purified DNA fragments can be hybridized to a microarray (ChIP-on-chip) or sequenced by next-generation sequencing (ChIP-seq). The main advantage of these methods is that they allow DNA methylation to be studied on a whole-genome scale. However, they cannot produce accurate methylation measurements at single-base resolution. Additionally, the accuracy of the DNA methylation detected by these methods is easily affected by the GC content of the genomic DNA sequence, which leads to low accuracy in regions with low GC content.
3. Methods Based on Sodium Bisulfite Conversion
Sodium bisulfite conversion is by far the most widely used method for DNA methylation detection. Its advantage is that it allows accurate detection of DNA methylation at single-base resolution. The main principle of this method is that, when DNA is treated with sodium bisulfite, unmethylated C (cytosine) is converted into U (uracil) while methylated C remains unchanged. Afterwards, the specific region of the sodium bisulfite-converted DNA is amplified via PCR, and the methylation level of this genomic region can be obtained by comparing the amplified sequence with the original sequence.
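The conversion principle and the comparison with the original sequence can be sketched in Python as follows. This is a simplified, single-strand illustration, not a description of any particular tool. It assumes that uracil is read as T after PCR and that reads are already aligned to the reference at the same length. The function names and input format are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment followed by PCR on one strand.

    Unmethylated C is converted (C -> U, read as T after PCR);
    methylated C is left unchanged.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)

def methylation_levels(reference, aligned_reads):
    """Per-cytosine methylation level: C / (C + T) among reads at that position.

    Assumes every read is aligned to the reference without gaps
    and has the same length (an assumption of this sketch).
    """
    levels = {}
    for i, base in enumerate(reference.upper()):
        if base != "C":
            continue
        c_count = sum(1 for read in aligned_reads if read[i] == "C")
        t_count = sum(1 for read in aligned_reads if read[i] == "T")
        if c_count + t_count:
            levels[i] = c_count / (c_count + t_count)
    return levels

reference = "ACGTCCGA"
reads = [
    bisulfite_convert(reference, {1}),     # molecule methylated at position 1
    bisulfite_convert(reference, {1, 5}),  # molecule methylated at positions 1 and 5
]
levels = methylation_levels(reference, reads)
```

Here a position where every read retains C is called fully methylated, and a position where every read shows T is called unmethylated.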
DNA methylation has become a hot topic in recent years, and the conventional methods for DNA methylation detection can no longer meet current research requirements. Owing to the development of high-throughput sequencing, DNA methylation detection has advanced from single-gene detection to whole-genome measurement. Many new methods are derived from combining the above three methods with high-throughput sequencing technologies, such as MeDIP, RRBS, HELP, etc., of which the most accurate method with the highest genome coverage is MethylC-seq. The principle of MethylC-seq is to directly sequence the sodium bisulfite-converted DNA fragments by next-generation sequencing. Theoretically, the methylation level of single bases over the whole genome can be obtained through an analysis of the sequencing results. However, this approach faces numerous obstacles: (1) most cytosines (C) in the genome are converted into thymine (T) after sodium bisulfite treatment, resulting in an imbalance of nucleotides and low complexity in the obtained sequencing reads, which limits their mapping efficiency to reference sequences. Moreover, the methylation information in some low GC content regions cannot be obtained even by increasing the amount of sequencing output. Therefore, so far we still do not have a complete map of whole-genome DNA methylation from any one cell type or tissue. Felix Krueger et al. have described the challenges in analyzing sequencing data of DNA methylation in detail in Nature Methods (Nat Methods. 2012 Jan. 30; 9 (2):145-51.); (2) there are defects in the design strategy of DNA methylation detection, causing a strong tendency to detect only regions with high methylation, owing to a lack of sensitivity to low methylation, low CG content and repeat sequences.
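The first obstacle above, nucleotide imbalance and reduced sequence complexity, can be made concrete with a small Python sketch. It assumes the extreme case in which every cytosine is unmethylated and therefore read as T. The helper names are hypothetical, and the toy sequences are illustrative only.

```python
from collections import Counter

def fully_convert(seq):
    """In-silico conversion assuming no cytosine is methylated (every C read as T)."""
    return seq.upper().replace("C", "T")

def base_fractions(seq):
    """Fraction of each nucleotide in the sequence."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) / len(seq) for base in "ACGT"}

def distinct_kmers(seq, k):
    """Number of distinct k-mers, used here as a crude complexity measure."""
    return len({seq[i:i + k] for i in range(len(seq) - k + 1)})

original = "ACGTCCGATCGA"
converted = fully_convert(original)

fractions_before = base_fractions(original)
fractions_after = base_fractions(converted)   # C drops to zero, T rises
kmers_before = distinct_kmers(original, 3)
kmers_after = distinct_kmers(converted, 3)    # fewer distinct 3-mers remain

# Two different genomic sequences can collapse to the same converted read,
# which is one reason mapping becomes ambiguous.
collapsed = fully_convert("ACCT") == fully_convert("ATCT")
```

In this toy case the converted sequence contains no C at all and has fewer distinct 3-mers than the original, so a read drawn from it carries less information for locating its origin in the reference.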
To sum up, while MethylC-seq is the best method for DNA methylation detection thus far in comparison to other available methods, its design defects, detection bias and obstacles in bioinformatics analysis greatly hinder its application. Here, we introduce the concept of positioning sequencing, which is used in our invention. It is capable of entirely solving the abovementioned problems and improving whole-genome DNA methylation detection overall.