Kazakh Scientists Unveil the First Comprehensive Genomic Dataset of the Great Steppe
Researchers from Nazarbayev University’s National Laboratory have created the first large-scale, high-quality genotyping dataset of healthy Kazakh individuals, a landmark contribution to global population genomics and biomedical research.
The study, published in the Nature Portfolio journal Scientific Data (DOI: 10.1038/s41597-025-05964-z), presents a detailed analysis of genetic diversity across 224 participants representing Kazakhstan’s major regions and tribal groups.
Mapping the Genetic Heritage of the Great Steppe
Despite the vast geographic and cultural significance of Central Asia, its populations remain largely underrepresented in global genome databases. To fill this gap, the NU team generated and analysed over 523,000 single nucleotide polymorphisms (SNPs) using the Illumina Infinium SNP Genotyping Array GSA MG v2 platform.
The resulting dataset provides an essential reference for population clustering, ancestry studies, and biomedical research, and it reveals Kazakhstan’s genetic profile as a unique bridge between East and West Eurasia.
“Our goal was to build a genomic foundation for precision medicine and population studies in Kazakhstan,” says Prof. Ulykbek Kairov, the project’s principal investigator. “The genetic landscape of Kazakhs reflects centuries of interaction along the Silk Road, and now we have data to explore that scientifically.”
Distinctive Findings with Biomedical Relevance
The study identifies 74 population-specific variants with potential biomedical implications, particularly in genes linked to metabolism and drug response.
Among the notable examples:
- CYP4F2 (rs2108622): a variant affecting the metabolism of anticoagulant drugs such as warfarin, found at a higher frequency in Kazakhs than in East Asians or Europeans, suggesting a need for population-tailored dosing.
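A frequency comparison of this kind comes down to counting alleles in each population. The following is a minimal sketch in Python, not the study's own code: the function name `alt_allele_frequency` and the genotype counts are hypothetical and chosen only for illustration.

```python
def alt_allele_frequency(n_hom_ref: int, n_het: int, n_hom_alt: int) -> float:
    """Return the alternate-allele frequency from diploid genotype counts."""
    n_individuals = n_hom_ref + n_het + n_hom_alt
    if n_individuals == 0:
        raise ValueError("no genotyped individuals")
    # Each heterozygote carries one alternate allele, each alt homozygote two;
    # every individual contributes two alleles to the denominator.
    return (n_het + 2 * n_hom_alt) / (2 * n_individuals)


# Made-up genotype counts (hom-ref, het, hom-alt); NOT taken from the study.
example_counts = {
    "population_a": (100, 80, 44),
    "population_b": (150, 60, 14),
}

for name, counts in example_counts.items():
    print(name, round(alt_allele_frequency(*counts), 3))
```

With real data, the counts would come from the genotype calls at rs2108622 in each cohort, and any difference would normally be checked with a statistical test before drawing conclusions about dosing.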
The research also found low levels of inbreeding and very few runs of homozygosity, consistent with cultural practices that discourage consanguineous marriages. These genetic patterns may affect the prevalence of hereditary diseases and can inform preventive health strategies in Kazakhstan.
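For readers unfamiliar with how inbreeding is quantified from genotype data, one common approach compares the number of homozygous genotypes an individual carries with the number expected from population allele frequencies. The sketch below shows a simplified method-of-moments estimator in Python. It is an assumption for illustration, not the method reported in the study, and the function name `inbreeding_coefficient` and the toy inputs are hypothetical.

```python
def inbreeding_coefficient(genotypes: list[int], alt_freqs: list[float]) -> float:
    """Estimate F for one individual as (O - E) / (L - E).

    genotypes: alternate-allele count (0, 1 or 2) at each SNP.
    alt_freqs: population alternate-allele frequency at each SNP.
    O = observed homozygous sites, E = expected homozygous sites,
    L = number of SNPs.
    """
    if len(genotypes) != len(alt_freqs):
        raise ValueError("genotypes and alt_freqs must have the same length")
    n_sites = len(genotypes)
    # A genotype of 0 or 2 is homozygous; 1 is heterozygous.
    observed_hom = sum(1 for g in genotypes if g != 1)
    # Under Hardy-Weinberg proportions, P(homozygous) = 1 - 2p(1 - p).
    expected_hom = sum(1 - 2 * p * (1 - p) for p in alt_freqs)
    if n_sites == expected_hom:
        raise ValueError("no polymorphic sites; F is undefined")
    return (observed_hom - expected_hom) / (n_sites - expected_hom)


# Toy input: four SNPs, each with an alternate-allele frequency of 0.5.
freqs = [0.5, 0.5, 0.5, 0.5]
print(inbreeding_coefficient([0, 1, 2, 1], freqs))  # as many homozygotes as expected
print(inbreeding_coefficient([0, 0, 2, 2], freqs))  # homozygous at every site
```

Values near zero indicate about as much homozygosity as random mating would produce, while values approaching one indicate an excess. Production tools apply additional corrections and quality filters that this sketch omits.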
An Open Resource for Global Research
All genotyping data are publicly available through the European Variation Archive (accession PRJEB89820), and the associated code is accessible on GitHub: https://github.com/LabBandSB/KAZ-GWAS.
This open-access model invites collaboration and comparative analysis across Eurasia and beyond.
“This dataset positions Kazakhstan within the global genomic map,” notes Dr. Dos Sarbassov, co-author and project supervisor. “It will help develop population-specific healthcare solutions and strengthen international cooperation in genomics.”
The research was conducted by scientists from the Center for Life Sciences at National Laboratory Astana, in collaboration with L.N. Gumilyev Eurasian National University and with support from international genomic databases.