DNA methylation is an important modification of DNA bases. It mainly refers to the covalent addition of a methyl group to the carbon-5 atom of cytosine, and it is present in the DNA of essentially all species. Cytosine methylation occurs in the CpG dinucleotide context and appears in pairs on the two strands of double-stranded DNA. In vertebrate genomes, most CpG-rich domains (CpG islands) are concentrated in gene promoter regions, and 60%-90% of CpG sites carry a methylated cytosine. DNA methylation can alter DNA conformation, DNA stability, the mode of DNA-protein interaction and chromatin structure, which in turn affects the regulation of gene expression. It therefore plays a major role in cell development and differentiation, the expression of phenotype-specific genes, X chromosome inactivation, and so on. Accurate sequencing of DNA methylation sites in the genome is thus an important part of a comprehensive understanding of the characteristics and functions of genes.
Traditional single-site methylation detection or sequencing methods (e.g., restriction enzyme digestion, restriction enzyme digestion-PCR, methylation-specific PCR, pyrosequencing, fluorescent quantitation, etc.) can detect only one or a few sites at a time because of technical limitations, and their procedures are complicated. A genome-wide methylation map can be drawn from chip-based methylation profiles, but this approach is limited in sample amount and is costly, which makes it unsuitable for large-scale use. With the development of second-generation sequencing in recent years, the distribution of methylation in the genome can be studied more systematically and accurately by high-throughput sequencing. Currently, there are three types of methylation sequencing methods based on high-throughput sequencing: (1) immunoprecipitation; (2) bisulfite sequencing; and (3) methylated CpG tandem amplification and sequencing (MCTA-Seq). The immunoprecipitation method requires antibodies with specific recognition, can only be regarded as semi-quantitative, and has a resolution of only about 100 bp. The bisulfite sequencing method is accurate to a single base and is the gold standard for methylation analysis. In this method, the DNA sample is treated with bisulfite, which converts unmethylated cytosine into uracil while leaving methylated cytosine unchanged. The promoter and CpG island regions are then enriched through enzyme digestion, gel purification, etc., and a library is constructed and sequenced. However, the procedure is complicated, time-consuming and costly, with poor cost-effectiveness, so it is not widely applicable. MCTA-Seq is an improved method based on the bisulfite method.
In this method, the collected DNA samples are treated with bisulfite to obtain converted samples, and CpG-enriched methylated regions (especially CpG island regions) are then amplified with specific primers to create DNA libraries. The target fragments are enriched by gel excision, which allows sequencing analysis of methylated CpG islands in specific regions. This method reduces cost and effectively covers more than 80% of CpG island regions, which is of great significance for the analysis of DNA methylation distribution. However, it still has three major defects: 1) the resulting DNA library always contains a very large number of primer dimers and impurities, which, together with the limitations of bisulfite sequencing itself, greatly interferes with the actual sequencing results and increases sequencing cost; 2) the range of CpG islands that the primers can capture still does not cover all promoter regions; 3) the library construction and purification steps still rely heavily on the judgment and skill of the operator, which is a disadvantage for automated industrial operation.
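The bisulfite conversion step shared by the bisulfite sequencing and MCTA-Seq methods described above can be illustrated in silico. The following is only a minimal Python sketch and is not part of the described methods: the function names, the example sequence and the methylated positions are hypothetical, and it assumes complete conversion, a single strand, and a read perfectly aligned to the reference.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment followed by amplification on one strand.

    Unmethylated C is deaminated to U and reads as T after amplification;
    methylated C is protected and still reads as C.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")  # unmethylated C -> U -> read as T
        else:
            converted.append(base)  # methylated C and all other bases unchanged
    return "".join(converted)


def call_cpg_methylation(reference, converted_read):
    """Return {position: is_methylated} for every CpG cytosine in the reference.

    Assumes the read has the same length as the reference and is aligned to it.
    """
    ref = reference.upper()
    calls = {}
    for i in range(len(ref) - 1):
        if ref[i:i + 2] == "CG":
            # A C retained at a CpG site means it was protected by methylation.
            calls[i] = converted_read[i] == "C"
    return calls


# Hypothetical example: CpG cytosines at positions 1, 5 and 9; 1 and 9 methylated.
reference = "ACGTTCGACCGA"
read = bisulfite_convert(reference, methylated_positions=[1, 9])
calls = call_cpg_methylation(reference, read)
print(read)   # ACGTTTGATCGA
print(calls)  # {1: True, 5: False, 9: True}
```

Comparing the converted read with the reference at each CpG position is what gives bisulfite-based methods their single-base resolution: a retained C indicates methylation, and a C read as T indicates an unmethylated site.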
A more efficient, commercially applicable high-throughput sequencing method for methylated CpG islands is currently lacking.