From genome to ecosystem: CNSA accelerates open data sharing worldwide

15/12/2025 TranSpread

Advances in sequencing technologies have transformed life science research, enabling multi-layered exploration of biological systems across species, tissues, and developmental stages. However, the resulting data explosion presents major challenges: standardizing storage formats, ensuring quality control, achieving cross-database interoperability, and delivering data at scale. Large-scale global genome sequencing projects, such as the Earth BioGenome Project and numerous organism-wide genomic initiatives, depend on reliable systems capable of handling diverse data types ranging from whole genomes to spatial transcriptomes. Traditional repositories alone cannot meet these broad and evolving needs. Given these challenges, further research into efficient multi-omics data archiving and open-sharing frameworks is required.

Researchers from the China National GeneBank (CNGB) published the 2024 update of the China National GeneBank Sequence Archive (CNSA) in Horticulture Research on May 1, 2025 (DOI: 10.1093/hr/uhaf036). The report details major advances in CNSA’s data scale, data types, visualization tools, international certification, and role in supporting global multi-omics research. CNSA now archives more than 16.3 petabytes of biological data from over 560 institutions worldwide, making it one of the largest open-access repositories for life science data.

CNSA provides public archiving and open-sharing services for a broad spectrum of biological data, including genome assemblies, raw sequencing reads, gene expression matrices, variation data, metabolomics profiles, viral sequences, and single-cell and spatial transcriptomic datasets. As of August 2024, the repository includes 1,122,067 samples, 1,766,269 sequencing datasets, and 125,855 genome assemblies, representing 7,521 species, supported by 47 sequencing platforms. A key update is the addition of a spatial transcriptomics archiving system, which captures tissue section metadata, image files, barcoding information, and spatial gene expression matrices, integrated with an online viewer that enables cell-type annotation, spatial region segmentation, and cell–cell interaction analysis. CNSA now supports high-speed data access through FTP, HTTPS, and Aspera transfer protocols, and has received formal certifications including CoreTrustSeal, FAIRsharing, and re3data, demonstrating global compliance with data management and preservation standards. CNSA also contributes to major international projects such as the 10KP Plant Genome Project, the Earth BioGenome Project, and the SpatioTemporal Omics Consortium, accelerating discovery across evolution, agriculture, ecology, and human health.

“Open and well-curated biological data resources are essential to advancing global scientific collaboration,” the authors noted. “The continued development of CNSA reflects the growing need to archive, preserve, and share complex multi-omics datasets at scale. By integrating quality control systems, standardized metadata formats, visualization platforms, and international interoperability frameworks, CNSA provides researchers worldwide with the tools required to accelerate genome science and biodiversity conservation.”

The updated CNSA platform supports broad research applications in plant and animal genomics, crop breeding, evolutionary biology, microbial ecology, medical research, environmental monitoring, and biodiversity protection. Its open-access structure encourages data reuse, reduces duplication, and supports integrative analyses that combine genomics, transcriptomics, phenotyping, and spatial mapping. Future developments will integrate artificial intelligence-assisted data curation, application programming interfaces (APIs), and cloud computing platforms to enable large-scale data analysis without requiring local storage. These advancements will further enhance CNSA’s role as a critical global infrastructure for accelerating biological discovery and supporting sustainable management of genetic resources.

###

References

DOI

10.1093/hr/uhaf036

Original Source URL

https://doi.org/10.1093/hr/uhaf036

Funding information

This study was supported by the Guangdong Genomics Data Center (2021B1212100001), Shenzhen Science and Technology Program (KQTD20230301092839007), Biological Breeding-National Science and Technology Major Project (2023ZD04073), and the China National GeneBank.

About Horticulture Research

Horticulture Research is an open access journal of Nanjing Agricultural University and was ranked number one in the Horticulture category of the Journal Citation Reports™ from Clarivate, 2023. The journal is committed to publishing original research articles, reviews, perspectives, comments, correspondence articles and letters to the editor related to all major horticultural plants and disciplines, including biotechnology, breeding, cellular and molecular biology, evolution, genetics, inter-species interactions, physiology, and the origination and domestication of crops.

Paper title: The China National GeneBank Sequence Archive (CNSA) 2024 update
Attached files
  • The system architecture of CNSA. The access layer processes user requests through various security measures before reaching the application layer. The application layer is built on Django and utilizes Redis for caching to enhance processing speed. The data layer manages operational data and metadata using a database system and file system, employing PostgreSQL along with NAS and Lustre for efficient data sharing and preservation.
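
The caption above says the application layer uses Redis for caching in front of a PostgreSQL-backed data layer, but it does not say how the cache is used. One common arrangement is the cache-aside pattern, sketched below purely as an illustration. Everything here is an assumption for the example: `TTLCache` is an in-memory stand-in for Redis, `load_from_db` stands in for a PostgreSQL query, and the class names, the `record:` key prefix, and the accession `EXAMPLE-0001` are hypothetical rather than taken from CNSA.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a Redis-style cache with per-key expiry (assumption)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired entries behave like a miss
            return None
        return value

    def set(self, key: str, value: dict, ttl_seconds: float) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)


class MetadataService:
    """Cache-aside lookup: check the cache, fall back to the database, fill the cache."""

    def __init__(
        self,
        cache: TTLCache,
        load_from_db: Callable[[str], Optional[dict]],  # hypothetical database accessor
        ttl_seconds: float = 300.0,
    ) -> None:
        self._cache = cache
        self._load_from_db = load_from_db
        self._ttl = ttl_seconds
        self.db_calls = 0  # counts how often the slower data layer is reached

    def get_record(self, accession: str) -> Optional[dict]:
        key = f"record:{accession}"  # hypothetical key scheme
        cached = self._cache.get(key)
        if cached is not None:
            return cached
        self.db_calls += 1
        record = self._load_from_db(accession)
        if record is not None:
            self._cache.set(key, record, self._ttl)
        return record


if __name__ == "__main__":
    fake_db = {"EXAMPLE-0001": {"title": "demo record"}}  # made-up data
    service = MetadataService(TTLCache(), fake_db.get, ttl_seconds=60)
    service.get_record("EXAMPLE-0001")  # miss: goes to the database
    service.get_record("EXAMPLE-0001")  # hit: served from the cache
    print(service.db_calls)
```

In this sketch, repeated requests for the same record reach the database only once per expiry window, which is the kind of speed-up the caption attributes to caching. Records that are not found are deliberately not cached, so a newly submitted accession becomes visible on the next request.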
Regions: North America, United States, Asia, China
Keywords: Science, Agriculture & fishing
