DNA Methylation Research Methods: Comparison of Advantages and Disadvantages of Methylation Analysis Methods
DNA methylation is a post-replication chemical modification that affects CG dinucleotides. In this process, a methyl group (CH3) is covalently attached to the cytosine of the CG dinucleotide. The reaction is catalyzed by enzymes called DNA methyltransferases (Dnmts) and takes place at the C5 position of the cytosine in the CG dinucleotide, commonly referred to as a CpG site (Figure 1). Methylation at CpG sites changes dynamically, and methylation profiles vary across genomes. Regions with a high density of CpG sites, known as "CpG islands", are usually located in the promoter regions of genes. Because DNA methylation is an enzyme-catalyzed transfer of methyl groups to CpG dinucleotides, a methyl donor compound is required to provide the methyl group. Methyl groups generally come from the diet, and common dietary sources include folate, betaine, and vitamin B12. These compounds ultimately affect the metabolism of methionine and S-adenosylmethionine (SAM), with SAM being the primary methyl donor for certain methyltransferases. Several methyltransferases (Dnmts) have been characterized so far; their functions differ but sometimes overlap. Dnmt3A and Dnmt3B establish methylation profiles on unmethylated DNA during early development and are therefore referred to as de novo methyltransferases. Dnmt1 preferentially acts on hemimethylated DNA and is crucial for maintaining the DNA methylation profile. DNA methylation machinery has been found in some algae, fungi, plants, invertebrates, and vertebrates.
The best-known mechanism by which DNA methylation affects gene expression is that methylation interferes with the binding of transcription factors to DNA, thereby repressing gene expression. However, DNA methylation can also promote gene expression: it can prevent the binding of certain insulator proteins, allowing nearby enhancers to function.
The Reprogramming of Mammalian Methylation During Development
DNA methylation profiles are particularly prone to reprogramming during certain critical periods, including the interval from fertilization to blastocyst implantation and early embryonic differentiation. Other sensitive periods have also been described. Epigenetic reprogramming in utero has traditionally been considered irreversible, but recent studies have shown that epigenetic changes mediated by maternal diet can be reversed by folate intake in the offspring during adolescence, although the mechanism behind this is not yet fully understood.
At the beginning of embryonic development, most epigenetic marks are erased, and the DNA methylation profile is then re-established at the onset of sex determination. Other epigenetic changes may also be involved in the initial stage of embryonic development. The period of epigenetic reprogramming is a window in which individual development is sensitive to external factors, and the resulting epigenetic profile can be induced and permanently transmitted between generations.
In recent years, it has become clear that DNA methylation and demethylation are essential biological events for epigenetic regulation during development.
DNA Methylation and Its Relationship with Diseases
Regarding the pathogenic mechanisms of adult diseases, researchers generally believe that epigenetic processes during development are the starting point of disease. A range of conditions has recently been found to have an epigenetic component, including allergic reactions, liver cancer, gastric cancer, asthma, colon cancer, prostate cancer, HIV latency, metabolic diseases, and cardiovascular and cerebrovascular diseases. Some studies have also described the relationship between DNA methylation and maternal effects, as well as sociobiological manifestations such as behavior, depression, and brain diseases. Bioinformatics and reliable experimental methods for detecting DNA methylation are therefore extremely important.
Detection Methods for DNA Methylation
In the past 30 years, advances in technologies for discovering epigenetic markers have allowed a deeper understanding of the role of epigenetics in medicine and biology. At present, DNA methylation is studied at three levels: global methylation, local methylation, and genome-wide methylation.
Global methylation analysis was the first analytical approach to be developed. It is mainly used to determine the overall DNA methylation level of the genome and does not consider local methylation changes. The original method relies on the methyltransferase SssI, which transfers radioactively labeled methyl groups from SAM onto the DNA sample. In addition, the cleavage activity of restriction endonucleases that are sensitive to CpG methylation is used to distinguish methylated from unmethylated DNA. More recent methods for measuring global DNA methylation include enrichment of methylated DNA with antibodies specific for methylated cytosine followed by fluorescence-based quantification; direct quantification of methylated cytosine by high-performance liquid chromatography with UV detection (HPLC-UV) or liquid chromatography-tandem mass spectrometry (LC-MS/MS); enzyme-linked immunosorbent assay (ELISA); and the use of short repetitive sequences or long interspersed nuclear elements (LINEs) as a proxy for the methylation status of the genome, in which 5mC can be detected by biotin-streptavidin immunoassay, ELISA, or pyrosequencing. The main limitation of global methylation analysis is that it detects only overall DNA methylation changes and ignores local ones. Yet local DNA methylation changes are often the more important, because most effects of compounds are reflected only in local rather than global methylation changes.
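As a minimal sketch of how a global methylation level might be expressed from direct quantification (for example by LC-MS/MS), the snippet below computes 5mC as a percentage of total cytosine. The function name and the input amounts are hypothetical, and the sketch assumes that only 5mC and unmodified C are quantified.

```python
def global_methylation_percent(mc_amount: float, c_amount: float) -> float:
    """Return 5mC as a percentage of total cytosine (5mC + unmodified C).

    Both amounts must be in the same unit (e.g. pmol); the values used
    below are made up for illustration.
    """
    total = mc_amount + c_amount
    if total <= 0:
        raise ValueError("total cytosine amount must be positive")
    return 100.0 * mc_amount / total


# Hypothetical sample: 4 units of 5mC and 96 units of unmodified C
sample_percent = global_methylation_percent(mc_amount=4.0, c_amount=96.0)
print(f"{sample_percent:.1f}% of cytosines are methylated")
```

A single number of this kind is exactly what makes the approach blind to local changes: two samples with very different promoter methylation can give the same percentage.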
Local methylation analysis refers to the methylation profiling of specific genes or regions of the genome. Initially, digestion with restriction enzymes that differ in methylation sensitivity was combined with PCR amplification and Southern hybridization to analyze the methylation status of specific loci. The most commonly used pair of restriction endonucleases is HpaII and its isoschizomer MspI. Both enzymes recognize CCGG and cleave within this site. However, MspI is insensitive to CpG methylation, which means that it cleaves regardless of whether the CG is methylated, whereas HpaII cannot cleave when the CG is methylated (Figure 2). The most commonly used method for analyzing local DNA methylation changes at the level of individual CpG sites, however, is bisulfite sequencing. This method was first described by Frommer et al. and has since been optimized by many researchers. The principle is that bisulfite converts unmethylated cytosine (C) into uracil (U) but has no effect on methylated cytosine. PCR amplification and sequencing are then performed to read out this conversion: methylated C remains C, while unmethylated C is converted to U and is amplified by PCR as T (Figure 3). Several detection methods can be used after bisulfite treatment, such as COBRA (combined bisulfite restriction analysis), which involves PCR amplification of bisulfite-treated samples followed by restriction digestion; direct sequencing; clone sequencing; pyrosequencing; and mass spectrometry. Ready-made programs are also available for designing methylation-specific primers. If real-time quantitative PCR is used to analyze methylation-specific PCR products, changes in melting curves and Ct values can be analyzed.
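The conversion principle described above (unmethylated C is read as T, methylated C stays C) can be sketched in a few lines. This is an illustrative simulation only: the reference sequence, the methylation patterns, and all function names are hypothetical, only the top strand is modeled, and conversion is assumed to be complete.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate conversion plus PCR on one strand.

    Unmethylated C -> U -> read as T; methylated C is protected and stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def cpg_positions(seq):
    """Return the index of the cytosine in every CG dinucleotide."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def methylation_levels(reference, reads):
    """Fraction of reads showing C (methylated) at each reference CpG."""
    levels = {}
    for pos in cpg_positions(reference):
        calls = [read[pos] for read in reads if read[pos] in "CT"]
        if calls:
            levels[pos] = sum(base == "C" for base in calls) / len(calls)
    return levels


# Hypothetical reference with CpG cytosines at positions 2, 6 and 10
reference = "ATCGTTCGACCGTA"
# Four hypothetical template molecules with different methylation patterns
molecules = [{2, 10}, {2}, {2, 6, 10}, set()]
reads = [bisulfite_convert(reference, m) for m in molecules]
levels = methylation_levels(reference, reads)
print(levels)
```

Each read here stands for one sequenced molecule, so the per-site fraction of C calls is the methylation level at that CpG in the sampled population.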
There is also a PCR-based variant, COLD-PCR, which uses a lower denaturation temperature to selectively amplify unmethylated DNA fragments. Direct sequencing can be performed on several loci simultaneously, but it has problems. For example, the T signal can be too strong relative to the C signal during sequencing, so that the C signal is masked. Moreover, sequencing software reports normalized rather than quantitative signals, which do not reflect the true proportion of the superimposed T and C signals at each CpG site. Measuring DNA methylation by direct sequencing therefore requires correction, either algorithmic or experimental. One correction algorithm, for example, uses the T signals at other positions as a reference to correct for the excessive T signal relative to C. Researchers have also recently developed a complete workflow for bisulfite sequencing analysis. Alternatively, sequencing individual clones after bisulfite conversion avoids the problem, because each clone represents the state of a single starting DNA molecule. This brings a new problem, however: the more molecules that need to be sampled, the more clones must be sequenced, and if only a few clones are sequenced, the analysis is subject to substantial error. Digital bisulfite sequencing dilutes the template so that amplification starts from single molecules, thereby replacing clone sequencing. Although pyrosequencing provides a quantitative readout of methylation status, its disadvantage is that the sequence analyzed in each reaction is very short, about 150 bp on average, which limits the number of CpG sites that can be analyzed.
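To give a rough sense of why sequencing only a few clones leads to large errors, the sketch below computes the binomial standard error of a methylation estimate at one CpG site. It assumes that each clone is an independent draw from the starting molecules (no cloning or PCR bias), which is a simplification; the function name and example values are hypothetical.

```python
import math


def clone_standard_error(methylated_fraction: float, n_clones: int) -> float:
    """Binomial standard error of a per-site methylation estimate.

    Assumes every sequenced clone is an independent, unbiased sample
    of the starting DNA molecules.
    """
    if not 0.0 <= methylated_fraction <= 1.0:
        raise ValueError("methylated_fraction must be between 0 and 1")
    if n_clones <= 0:
        raise ValueError("n_clones must be positive")
    return math.sqrt(methylated_fraction * (1.0 - methylated_fraction) / n_clones)


# Hypothetical site that is truly 50% methylated
for n in (10, 30, 100):
    se = clone_standard_error(0.5, n)
    print(f"{n:>3} clones: standard error of {100 * se:.1f} percentage points")
```

Under this assumption the error shrinks only with the square root of the number of clones, so halving it requires sequencing four times as many clones.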
However, combining bisulfite treatment with mass spectrometry detection can greatly increase the length of the DNA fragments analyzed for methylation (up to 500 bp).
Researchers have also developed methods for separating methylated DNA by electrophoresis. This approach requires denaturing gradient polyacrylamide gels and is very time-consuming and laborious. More recently, a rapid colorimetric assay based on methyl-binding domain (MBD) protein has been developed to analyze the methylation status of the whole genome or of low-abundance specific genes. First, genomic DNA is digested with restriction endonucleases. The DNA ends are then blunted by fill-in with Klenow polymerase and biotin-labeled dNTPs (biotin-dNTPs). Next, MBD-coupled magnetic beads are used to capture the methylated DNA. Finally, streptavidin (SA)-conjugated horseradish peroxidase (SA-HRP) is used to quantify the amount of methylated DNA (Figure 4). If methylation of a specific gene needs to be detected, biotin-dNTPs can be used to amplify the gene, and SA-labeled magnetic beads or SA-HRP can then be used for detection. This method can also be combined with electrochemical detection.
The first methods for examining DNA methylation of specific genes or regions at the genome-wide level also relied on restriction endonucleases that differ in methylation sensitivity. The AIMS (Amplification of Inter-Methylated Sites) method uses the methylation-sensitive enzyme SmaI and its methylation-insensitive counterpart PspAI to cleave genomic DNA, and then ligates adaptors for genome-wide PCR. Differential amplification reflects changes in DNA methylation status. Similar methods such as HELP (HpaII tiny fragment Enrichment by Ligation-mediated PCR) use both HpaII (methylation sensitive) and its isoschizomer MspI (methylation insensitive).
One of the most commonly used methods for analyzing regional DNA methylation changes at the genome-wide scale is methylated DNA immunoprecipitation (MeDIP). This method uses an antibody against methylated cytosine to immunoprecipitate (IP) and enrich methylated DNA fragments, which are then hybridized to a microarray to identify DNA methylation sites (MeDIP-chip). It has been used to map the methylation profiles of Arabidopsis thaliana, human breast cancer, and the human major histocompatibility complex (MHC). MeDIP-chip has significant advantages for mapping methylation changes at the whole-genome level. However, because of false positives, its results require validation by local methylation analysis methods. Another method for analyzing changes in methylation status across the entire genome uses the methylation-dependent endonuclease McrBC to remove methylated DNA fragments and then identifies the fragments that have disappeared by microarray hybridization and comparison with a control. Recently, researchers have used the Infinium HumanMethylation27 BeadChip, a human methylation array, to analyze methylation changes in disease-related DNA fragments. This method is a variant of the GoldenGate SNP genotyping assay: after bisulfite conversion, the methylation status of a CpG site is read out as a C/T polymorphism at the target site.
The Infinium HumanMethylation450 (HM450) BeadChip, the successor to the HumanMethylation27 BeadChip, has been widely used in recent years. This chip can detect up to 485,000 CpG sites in both coding and non-coding regions of human DNA (including miRNA promoters and 5' and 3' UTRs). Another methylation array, the VeraCode Methylation Array, also from Illumina, can be used for the analysis of non-human DNA. Thanks to the popularity of these arrays, a large amount of data has been generated in recent years. Correspondingly, data processing workflows and computational prediction methods continue to develop.
MeDIP is often combined with sequencing (MeDIP-seq): because only the immunoprecipitated methylated DNA fragments are sequenced, the starting material for sequencing is much less complex. Other methylated DNA capture techniques also exist, such as capture with the methyl-binding protein MeCP2 followed by sequencing, known as MethylCap-seq. Alternatively, researchers first digest the genome with the methylation-insensitive restriction endonuclease MspI to produce fragments enriched for CpG sites, which can reduce the amount of DNA to be sequenced to about 1%. The digested fragments are then treated with bisulfite and sequenced. This method is called RRBS (Reduced Representation Bisulfite Sequencing). MeDIP-seq, MethylCap-seq, and RRBS all provide accurate data on methylated DNA; the difference is that MeDIP-seq and MethylCap-seq cover a wider range of genomic locations, while RRBS has limited coverage in regions with lower CpG content. MethylCap-seq can detect more differentially methylated regions (DMRs) than RRBS, while RRBS can detect more DMRs than MeDIP.
Recently, researchers compared MethylCap-seq with Illumina's HM450 chip and found that the two methods are largely complementary: HM450 is more sensitive, while MethylCap-seq covers more methylation sites.
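The MspI digestion step that underlies RRBS can be illustrated with a small in silico digest. This is a sketch under stated assumptions: MspI is modeled as cutting the top strand after the first C of every CCGG site, the toy sequence and the size-selection thresholds are hypothetical, and real protocols select much longer fragments on a gel or with beads.

```python
import re


def mspi_fragments(seq):
    """Cut the top strand at every C^CGG site and return the fragments."""
    # Lookahead finds every CCGG start; the cut falls after the first C.
    cut_sites = [m.start() + 1 for m in re.finditer(r"(?=CCGG)", seq)]
    bounds = [0] + cut_sites + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_select(fragments, min_len, max_len):
    """Keep fragments whose length lies within the selected window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Hypothetical toy sequence with two CCGG sites
toy_genome = "AACCGGTTTTCCGGAA"
fragments = mspi_fragments(toy_genome)
selected = size_select(fragments, min_len=5, max_len=10)
print(fragments)
print(selected)
```

Because every internal fragment begins and ends at a CCGG site, each selected fragment carries at least one CpG at its ends, which is why a small fraction of the genome can capture a disproportionate share of CpG-rich regions.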
In recent years, a highly promising whole-genome DNA methylation analysis method based on deep sequencing has emerged, which not only quantifies methylation accurately but also achieves single-CpG resolution. This powerful technique has been used to draw methylation maps of Arabidopsis and humans. It also relies on bisulfite treatment, after which adapters are ligated for whole-genome sequencing. However, compared with methods that enrich methylated DNA before sequencing, it is relatively expensive, which limits its application. More recently, single-molecule real-time (SMRT) sequencing has emerged. This method incorporates fluorescently labeled nucleotides into the strand complementary to the DNA template. Differences in base incorporation kinetics can be used to distinguish epigenetic marks such as methylation and hydroxymethylation, thus bypassing the bisulfite conversion step.
Tools for Studying DNA Methylation Dynamics
In the past few years, DNA methylation research has been a hot topic in the field of epigenetics. In mammals, DNA demethylation proceeds through TET-mediated oxidation of 5mC (5-methylcytosine) to 5hmC (5-hydroxymethylcytosine), 5fC (5-formylcytosine), and 5caC (5-carboxylcytosine). There are currently two main approaches for detecting these oxidized bases: one is independent of bisulfite treatment, and the other combines bisulfite treatment with methyltransferase treatment.
The bisulfite-independent method detects 5fC at the whole-genome level. It chemically labels 5fC and couples it to biotin for enrichment, then ligates adapters and uses next-generation sequencing to identify 5fC sites (the chemically labeled 5fC is read as T after PCR).
The second method is called methylase-assisted bisulfite sequencing (MAB-seq). It detects 5fC and 5caC at single-base resolution. The principle is to treat the DNA with the SAM-dependent methyltransferase M.SssI before bisulfite treatment, so that unmodified C in CpG sites is converted to 5mC; 5fC and 5caC can then be detected after bisulfite conversion (Figure 5). An improved version, RRMAB-seq (reduced-representation MAB-seq), increases the coverage of CpG-rich regions and reduces cost.
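The logic of the M.SssI pre-treatment can be summarized as a small lookup. The sketch below encodes the commonly described bisulfite behaviour of each cytosine form (C, 5fC and 5caC read as T; 5mC and 5hmC read as C) and applies it to cytosines in a CpG context only. It assumes complete enzymatic and chemical conversion, and the names are hypothetical.

```python
# Base read after bisulfite conversion and PCR, for each cytosine form
BISULFITE_READOUT = {
    "C": "T",     # unmodified cytosine is deaminated
    "5mC": "C",   # protected
    "5hmC": "C",  # protected
    "5fC": "T",   # deaminated
    "5caC": "T",  # deaminated
}


def bs_seq_readout(cytosine_form):
    """Read-out of a CpG cytosine in conventional bisulfite sequencing."""
    return BISULFITE_READOUT[cytosine_form]


def mab_seq_readout(cytosine_form):
    """Read-out of a CpG cytosine when M.SssI treatment precedes bisulfite."""
    # M.SssI methylates unmodified CpG cytosines; other forms are unchanged.
    after_methylase = "5mC" if cytosine_form == "C" else cytosine_form
    return BISULFITE_READOUT[after_methylase]


for form in BISULFITE_READOUT:
    print(f"{form:>5}: BS-seq reads {bs_seq_readout(form)}, "
          f"MAB-seq reads {mab_seq_readout(form)}")
```

In conventional bisulfite sequencing a T is ambiguous between unmodified C, 5fC and 5caC, whereas after the methylase step a T at a CpG points to 5fC or 5caC alone.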
Detection of 5hmC using AbaSI sequencing technology
In AbaSI-based sequencing, the restriction endonuclease AbaSI selectively recognizes 5gmC (glucosylated 5hmC) rather than 5mC or unmodified C and produces a double-strand break on the 3' side of the recognition site. The sequence preferentially recognized by AbaSI is shown in Figure 6A. The method uses T4 β-glucosyltransferase to convert 5hmC into 5gmC; the DNA is then digested with AbaSI, and biotin-modified double-stranded adapters are ligated to the resulting 3' overhangs. After fragmentation and enrichment, a second adapter carrying a dT overhang is ligated to the other end, where it pairs with the dA tail added to the fragments. PCR amplification and sequencing are then performed (Figure 6B).
Building on this method, Mooijman et al. developed a genome-wide method for detecting 5hmC in single cells. In this method, the biotinylated adapter is replaced with an adapter containing a cell-specific barcode, the Illumina 5' adapter, and a T7 promoter. The glucosylated, adapter-ligated DNA is transcribed in vitro, and the resulting RNA is fragmented and incorporated into an RNA sequencing library (Figure 7).
Conclusion
With the progress of epigenetic research and the growing evidence linking epigenetic changes to several non-communicable diseases, reliable quantitative DNA methylation profiling is becoming increasingly important. Over the past 30 years, the technology for studying DNA methylation changes has developed steadily, evolving from global DNA methylation analysis, which ignored local changes, to reliable quantitative methylation analysis of a specific gene or genomic region. Methods for analyzing DNA methylation changes at single-CpG resolution across the whole genome have also gradually matured. In recent years in particular, the rapid development of sequencing technology has greatly reduced the cost of sequencing-based DNA methylation research, making it increasingly common in laboratories worldwide. As sequencing costs fall further, analyzing all CpG sites in the entire genome can be expected to become feasible in the near future. However, selecting a method for a specific research question remains a significant challenge. Recent reviews have specifically compared the characteristics of DNA methylation research techniques, including computational algorithms for predicting an appropriate experimental method (Fig 8).
The challenges are also evident. With the massive production of data, bioinformatics analysis has become essential, especially for genome-scale research. At present, epigenetic research focuses on the molecular mechanisms of disease and on reproductive and developmental biology. The falling cost of analytical methods has allowed methylation research to be applied not only in these popular fields but also in other areas of biology and in more model organisms. Lower costs have also made larger sample sizes possible, yielding higher-quality data that are better suited to comparing epigenetic states.