Capturing Diversity: A First Look at the Arab Pangenome

Arab populations are underrepresented in genomic references, impacting diagnosis and personalized care. We built the first Arab Pangenome Reference to address this. Our Nature Communications paper: https://www.nature.com/articles/s41467-025-61645-w

When we embarked on this study, we knew we were addressing a major gap in human genomics. Despite the availability of increasingly sophisticated genome references, Arab populations remained largely absent. We wanted to change that, and with support from regional collaborators, we began assembling what would become the first draft of the Arab Pangenome Reference (APR). Our process relied on cutting-edge sequencing technologies: high-fidelity long reads, ultralong nanopore reads, and Hi-C data. These approaches allowed us to build highly contiguous, haplotype-phased genome assemblies from 53 individuals representing diverse Arab ethnicities.
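
For readers less familiar with assembly work, "highly contiguous" is usually quantified with a summary statistic such as N50: the contig length at which the longest contigs, added up in descending order, first cover at least half of the total assembly. The snippet below is a minimal sketch of that standard definition. It is not taken from the APR pipeline, the post does not say which contiguity metrics were used, and the contig lengths are made-up toy values.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in base pairs)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Toy, made-up contig lengths for two hypothetical assemblies of equal total size.
fragmented = [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
contiguous = [60, 20, 10, 5, 5]

print(n50(fragmented))  # 10
print(n50(contiguous))  # 60
```

Both toy assemblies contain the same number of bases, but the second concentrates them in fewer, longer pieces, and that is what a higher N50 reflects.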

What we found exceeded our expectations. We discovered 111.96 million base pairs of novel sequence absent from even the most comprehensive human references, such as GRCh38 or T2T-CHM13. More importantly, we identified gene duplications such as TAF11L5, which was consistently duplicated across all individuals we studied. We also uncovered millions of population-specific small variants and hundreds of thousands of structural variants, many with potential biomedical significance. These included a number of duplicated genes associated with recessive conditions, offering clues to genetic disease burdens within Arab populations. Even our exploration of mitochondrial genomes revealed previously uncharted sequence variation.

What started as a technical challenge turned into a broader mission, one rooted in equity, representation, and the future of personalized healthcare. By offering a high-quality, open resource based on Arab genomic diversity, the APR not only addresses historical gaps but also invites deeper collaboration, research, and policy change. We envision this resource being used to power more accurate variant interpretation, support population-specific GWAS, and improve rare disease diagnostics in the region. The journey has only begun, but we hope this work serves as both a scientific contribution and a call to action: to ensure all populations, especially those long underrepresented, are part of the genomic future.

Access the Data

The data supporting the Arab Pangenome Reference study are publicly available.
