Imputing not available values in single-cell DNA methylation data using the median is straightforward and effective

28/11/2025 Frontiers Journals

DNA methylation (DNAm) is one of the earliest identified epigenetic modifications and plays an essential role in regulating normal cellular processes, embryogenesis, and tumor development and progression. In recent years, advances in single-cell DNA methylation (scDNAm) profiling have provided unprecedented opportunities to explore epigenetic differences between cells at high resolution. Most current analyses of scDNAm data are based on cell-by-region matrices. A simple and effective way to construct such a matrix is genome window binning: the genome is divided into fixed-length blocks (e.g., 100 kbp), and the average DNA methylation level of each cell in each region is computed.

Before downstream analysis, however, a critical issue remains: how to handle the not available (NA) values in scDNAm data. In single-cell RNA sequencing (scRNA-seq) or single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq) data, missing values are usually represented as zero read counts. In scDNAm data, by contrast, captured methylation sites are typically binary, either methylated (read count of 1) or unmethylated (read count of 0), while uncaptured sites are marked as NA. When a cell-by-region matrix is built by window binning, the uneven distribution of methylation sites across the genome and the choice of window size mean that many regions contain no captured sites in a given cell, so their average methylation level is NA. A methylation matrix containing NA values cannot be used for downstream analyses, which makes imputing the NA values a necessary preprocessing step.
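The window binning step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the study's code: the function name `bin_methylation`, its parameters, and the toy coordinates are hypothetical, and it assumes a single chromosome with 0-based site positions smaller than `genome_length`.

```python
import numpy as np

def bin_methylation(positions, calls, genome_length, window=100_000):
    """Average the binary methylation calls of one cell in fixed-length windows.

    positions: 0-based coordinates of captured sites (assumed < genome_length)
    calls:     1 for methylated, 0 for unmethylated, one per captured site
    Returns one value per window; windows with no captured site are NaN (NA).
    """
    n_bins = -(-genome_length // window)           # ceiling division
    idx = np.asarray(positions) // window          # window index of each site
    calls = np.asarray(calls, dtype=float)
    sums = np.bincount(idx, weights=calls, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    levels = np.full(n_bins, np.nan)               # start with every window as NA
    covered = counts > 0
    levels[covered] = sums[covered] / counts[covered]
    return levels

# Toy cell: three captured sites on a 400 kbp stretch, 100 kbp windows.
row = bin_methylation([10, 50_000, 250_000], [1, 0, 1], genome_length=400_000)
print(row)
```

In this toy case the first window averages one methylated and one unmethylated site (0.5), the third window holds a single methylated site (1.0), and the two windows without captured sites stay NA. Stacking one such row per cell gives the cell-by-region matrix.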

A recent study by the BioX lab at the School of Mathematical Sciences, Nankai University, published in the journal Quantitative Biology under the title "Imputing not available values in single-cell DNA methylation data using the median is straightforward and effective," reports that median imputation is a simple and effective way to handle NA values in scDNAm data.

When analyzing scDNAm data, an intuitive solution is to impute all NA values as zeros. There is, however, an argument for the opposite choice. In scRNA-seq data, higher read counts typically correspond to higher gene expression, and gene expression is strongly negatively correlated with DNA methylation. Missing values in scRNA-seq data are usually treated as zeros, that is, as no expression, and by this analogy the corresponding choice for scDNAm data is to impute NA values as ones. Another intuitive option is to smooth NA values with a summary statistic. For instance, EpiScanpy imputes NA values with the mean methylation level of a region across all cells. The study suggests that imputing NA values with the median instead is a simple and effective way to highlight cellular heterogeneity in scDNAm data. It provides an accurate data foundation for downstream analyses and allows a more precise and reliable interpretation of the underlying biological processes.
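The four strategies discussed above (zeros, ones, mean, median) can be compared on a toy matrix. The sketch below is not the study's implementation and does not use EpiScanpy's API: the function `impute_na` is hypothetical, and it assumes that the median, like the mean, is taken per region across all cells with an observed value, which this article does not spell out.

```python
import numpy as np

def impute_na(matrix, strategy="median"):
    """Fill NaN entries of a cell-by-region matrix (rows: cells, columns: regions).

    Assumption: "mean" and "median" are computed per region over the cells
    that have an observed value. A region that is NA in every cell stays NaN
    (NumPy also emits a RuntimeWarning in that case).
    """
    m = np.array(matrix, dtype=float)              # work on a copy
    na = np.isnan(m)
    if strategy == "zero":
        m[na] = 0.0
    elif strategy == "one":
        m[na] = 1.0
    elif strategy in ("mean", "median"):
        stat = np.nanmean if strategy == "mean" else np.nanmedian
        fill = stat(m, axis=0)                     # one fill value per region
        m[na] = fill[np.nonzero(na)[1]]            # look up by column index
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return m

# Toy matrix: 4 cells x 2 regions, with one NA in each region.
toy = [[0.2, np.nan],
       [0.4, 1.0],
       [np.nan, 0.0],
       [0.9, 0.5]]
for name in ("zero", "one", "mean", "median"):
    print(name, impute_na(toy, name).tolist())
```

In the first region the observed values 0.2, 0.4 and 0.9 have a mean of 0.5 but a median of 0.4, so the two smoothing strategies already differ on this small example; the median is less sensitive to the single high value.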
DOI:10.1002/qub2.7000
Attached files
  • Figure 1 The effect of various imputation strategies.
Regions: Asia, China
Keywords: Science, Life Sciences
