Why and how we built the cactus ecological database
Published in Earth & Environment, Ecology & Evolution, and Research Data
Cacti under pressure
Cacti are among the most recognisable and popular plants (Figure 1). From towering saguaros in the Sonoran desert to small, globose cacti on windowsills in London, they capture our imagination as icons of endurance, survivors of the planet's inhospitable environments (Figure 2).
However, the data are starting to fracture that image. The cactus family (~1,850 species) is now recognised as one of the most threatened taxonomic groups 1. Roughly 31% of species are threatened with extinction, driven by habitat loss, overexploitation, and climate change. Recent projections suggest that 60–90% of species might experience range contractions in the near future 2. Even the iconic saguaros are collapsing in the Arizona summer heat. The world is changing faster than cacti, with their very slow pace of life and reproduction, can adapt.
At the same time, global fascination with cacti has never been stronger. They are among the most traded ornamental plants, and much of that trade involves wild species rather than cultivars. This is big money too. A recent landmark case saw two cactus smugglers sentenced to prison and hefty fines 3. The stolen cacti were returned to their home country, Chile, setting a precedent for how crimes against biodiversity are prosecuted.
It is a funny moment for cacti. Some of the most resilient plants are simultaneously globally celebrated and at great risk. However, the data needed to understand their conservation, ecology, evolution and potential future have long been scattered across textbooks, floras, genetic databases and biodiversity repositories. They are fragmented, inconsistent, and difficult to synthesise 4.
A few years ago, we recognised that to understand the origins and future of cacti, we needed to bring these pieces together. We recently published the Cactus Ecological Database (CactEcoDB), which draws on hundreds of data sources collected over seven years by an international team 5.
Figure 1: The diversity of cacti. Nine examples of cacti exhibiting different morphologies. This diversity, coupled with their species richness and young evolutionary age, makes cacti one of the fastest-evolving plant groups. Cactus images are used under Creative Commons licenses with modifications allowed. Image (a) used a photo taken by Maria Vorontsova, which is licensed under the Creative Commons CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/). Images (b) and (g) used photos taken by Dr. Hans-Günter Wagner, which are licensed under a Creative Commons Attribution-ShareAlike 2.0 License (https://creativecommons.org/licenses/by-sa/2.0/). Images (c) and (h) used photos taken by Amante Darmanin, which are licensed under a Creative Commons Attribution 2.0 License (https://creativecommons.org/licenses/by/2.0/). Image (d) used a photo taken by Jesse Pluim (BLM), which is marked as being in the Public Domain using the Public Domain Mark 1.0 (https://creativecommons.org/publicdomain/mark/1.0/). Image (e) used a photo taken by Laurent Houmeau, and image (f) used a photo taken by Sergio Niebla; both are licensed under a Creative Commons Attribution-ShareAlike 2.0 License (https://creativecommons.org/licenses/by-sa/2.0/).
How this project began
I did not expect to become a cactus researcher. I started working on cacti in 2019, when COVID derailed my original PhD plans to build insect traps in Central America. I became fascinated by the forces shaping the origins of biodiversity, especially in charismatic and unusually diverse plant groups. During my PhD, I worked on orchid evolution, cactus diversification, and the impact of the K–Pg mass extinction across all flowering plants 6.
My PhD supervisor, Dr Nick Priest, a fruit fly behavioural geneticist, and I were a strange team to explore the botanical world. But being outsiders gave us freedom. We challenged ourselves (and each other), asked naïve questions, and were not afraid to be too ambitious. Along the way, we were fortunate to work with researchers worldwide who generously supported and guided what must have seemed like overenthusiastic newcomers.
For this work, I was honoured to receive the Irene Manton Prize for the best botanical PhD in 2025, and I have recently been awarded an Early Career Fellowship by the Leverhulme Trust to continue working on these questions.
No existing biodiversity database for cacti
The cactus diversification study that preceded CactEcoDB used machine learning alongside an initial dataset of cactus distributions, environmental niches, and adaptive trait variation to identify drivers of speciation rates 4. The results showed that cactus speciation is complex and multifactorial, and that several long-standing hypotheses, such as simplistic links to aridity and pollinators, did not hold up 7,8. But just as revealing as the biological results was what I learnt about the practicalities of studying cacti.
Unlike many animal groups, cacti had no central, curated biodiversity database. This is not to say zoologists have it easy, but for mammals or birds it is often possible to download trait datasets from resources like PanTHERIA or AVONET 9,10. These datasets have lowered the barrier to research in these groups and are a powerful tool for studying biodiversity. For cacti, even the most basic information was fragmented. Traits such as plant height and growth form (seemingly obvious, easily collected variables) were scattered across textbooks, monographs, floras and taxonomic descriptions. Spatial data were sparse, inconsistent, uncertain, and biased by many factors. Critically, we also lacked a well-sampled molecular phylogeny for the family to enable evolutionary research.
To carry out the diversification research, I spent years assembling trait data from the literature, cleaning and curating spatial records, and gathering molecular sequence data to build a usable phylogeny. The latter was particularly challenging in cacti, where low sequence variation makes it difficult to confidently delimit lineages 11.
What surprised me most was that, despite decades of research and enormous global interest in cacti, no one had attempted to bring these disparate data together into an open, extensible, community resource. When Scientific Data extended the invitation to develop the data into a standalone database, we were honoured and made a plan.

Figure 2: The spatial distribution of species richness in cacti across the Americas. Cacti are widely distributed throughout ecosystems in the Americas, and there are several notable hotspots with elevated diversity in arid and semi-arid regions.
How we built it
To build the best and most useful database we could, we knew we had to go beyond simply repackaging our previous data. We wanted CactEcoDB to be open, up to date and transparent, a resource for the community to trust and build upon. This meant expanding the team to draw on diverse expertise and skill sets.
Catherine Martinez undertook the formidable task of recollecting plant height data and redefining growth forms into a more realistic system. Jorge Avaria-Llautureo assembled and harmonised new environmental layers. Santiago Ramírez-Barahona and Gerardo Manzanarez-Villasana led extensive curation of spatial data, replacing occurrence records with expert-defined ranges wherever possible. George Ryan assisted with the construction of the new phylogeny, which we chose to remake using information from a newly published phylogenomic backbone. Andrew Gdaniec brought three decades of cultivation experience and more than 30 field expeditions’ worth of knowledge. Andrew caught errors and inconsistencies in plant height and pollinator data that no statistical pipeline could detect. Alongside this, Alastair Culham, Chris Venditti, Georgia Keeling and Nick Priest provided discussion, feedback and inputs that shaped the database and made it possible. CactEcoDB emerged from ongoing conversations over many years about how to make cactus research not only possible but more open, robust and powerful.
We also made deliberate choices to ensure transparency. We provided multiple measures of speciation rate rather than a single preferred metric. We removed pollination records that could not be independently verified, even though this reduced coverage. We documented uncertainty (e.g. of environmental niches and growth forms) rather than smoothing it away (Figure 3).
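The trade-off between verification and coverage can be illustrated with a minimal Python sketch. It is not the CactEcoDB pipeline: the `PollinationRecord` class, the `verified` flag, and the placeholder species names are all hypothetical, chosen only to show how dropping unverified records lowers the share of species with any pollination data.

```python
from dataclasses import dataclass


@dataclass
class PollinationRecord:
    species: str
    pollinator: str
    verified: bool  # hypothetical flag: True if independently confirmed


def keep_verified(records):
    """Return only the records that were independently verified."""
    return [r for r in records if r.verified]


def coverage(records, all_species):
    """Fraction of species that have at least one record."""
    covered = {r.species for r in records}
    return len(covered & set(all_species)) / len(all_species)


# Toy data with placeholder names (not real CactEcoDB entries).
records = [
    PollinationRecord("Species A", "bee", True),
    PollinationRecord("Species B", "bat", False),
    PollinationRecord("Species C", "hummingbird", True),
    PollinationRecord("Species C", "bee", False),
]
species = ["Species A", "Species B", "Species C", "Species D"]

verified = keep_verified(records)
print(coverage(records, species))   # 0.75
print(coverage(verified, species))  # 0.5
```

In this toy case, coverage falls from three species in four to two in four, which is the kind of cost accepted in exchange for records that can be trusted.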

Figure 3: A selection of the trait data we assembled for CactEcoDB. These data capture many aspects of cactus biodiversity and will be useful for macroecological and macroevolutionary research.
The value of an integrated database
By integrating traits, expert-curated spatial ranges, environmental data, time-calibrated phylogenies and multiple estimates of speciation rate into a single open resource, we make large-scale evolutionary and ecological research in cacti possible. Researchers can now explore trait–environment relationships, climatic niche evolution, diversification dynamics and conservation, without spending years assembling data.
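As a rough illustration of what integration buys, the sketch below joins a trait table to an environmental table by species name and computes a simple correlation. Everything in it is an assumption made for the example: the field names (`max_height_m`, `mean_annual_precip_mm`), the species labels and the values are invented, and the real database files may be organised quite differently.

```python
import math

# Toy tables keyed by species name (placeholder names and values).
traits = {
    "Species A": {"max_height_m": 1.0},
    "Species B": {"max_height_m": 2.0},
    "Species C": {"max_height_m": 3.0},
    "Species D": {"max_height_m": 0.5},  # no environmental data below
}
environment = {
    "Species A": {"mean_annual_precip_mm": 100.0},
    "Species B": {"mean_annual_precip_mm": 200.0},
    "Species C": {"mean_annual_precip_mm": 300.0},
}


def join_by_species(left, right):
    """Inner join: keep only species present in both tables."""
    shared = sorted(left.keys() & right.keys())
    return {sp: {**left[sp], **right[sp]} for sp in shared}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


merged = join_by_species(traits, environment)
heights = [row["max_height_m"] for row in merged.values()]
precip = [row["mean_annual_precip_mm"] for row in merged.values()]
r = pearson(heights, precip)
print(len(merged), round(r, 3))
```

A real analysis of species data would also need to account for shared ancestry, for example with phylogenetic comparative methods, which is one reason the phylogenies sit alongside the traits in the same resource.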
For a plant family in which nearly a third of species are already threatened, and many more face shrinking ranges under climate change, having accessible, synthesised data matters 1,2. Most of all, CactEcoDB is designed to grow. We hope it becomes something the community can refine, challenge and build upon.
An evolving database
CactEcoDB is already being used in several ongoing projects, spanning trait evolution, diversification dynamics and spatial dynamics. We have begun expanding it further, incorporating new trait data collected for a recent study 12. These data allowed us to explain the relationship between pollinators, flowers and diversification.
As new occurrence data are collected, curated and published, expert range maps are refined, new molecular datasets are published, and additional trait data become available, we will continue to update and grow the database.
Our hope is that others contribute to the growth of CactEcoDB. The more it is challenged, refined and expanded, the more useful it becomes. CactEcoDB is Open Access and available at Figshare 13 (https://doi.org/10.6084/m9.figshare.30940019.v2).
References
1. Goettsch, B. et al. High proportion of cactus species threatened with extinction. Nat Plants 1, 15142 (2015).
2. Pillet, M. et al. Elevated extinction risk of cacti under climate change. Nat Plants 8, 366–372 (2022).
3. Quaglia, S. Operation Atacama: The $1m cactus heist that led to a smuggler’s downfall. BBC News (2025).
4. Thompson, J. B., Hernández-Hernández, T., Keeling, G., Vásquez-Cruz, M. & Priest, N. K. Identifying the multiple drivers of cactus diversification. Nat Commun 15, 7282 (2024).
5. Thompson, J. B., Martinez, C., Avaria-Llautureo, J., Ramírez-Barahona, S., Manzanarez-Villasana, G., Culham, A., Gdaniec, A., Ryan, G., Venditti, C., Keeling, G. & Priest, N. K. CactEcoDB: Trait, spatial, environmental, phylogenetic and diversification data for the cactus family. Scientific Data https://doi.org/10.1038/s41597-026-06936-7 (2026).
6. Thompson, J. Tempo and drivers of angiosperm diversification. The University of Bath’s research portal https://researchportal.bath.ac.uk/en/studentTheses/tempo-and-drivers-of-angiosperm-diversification/.
7. Hernández-Hernández, T., Brown, J. W., Schlumpberger, B. O., Eguiarte, L. E. & Magallón, S. Beyond aridification: multiple explanations for the elevated diversification of cacti in the New World Succulent Biome. New Phytologist 202, 1382–1397 (2014).
8. Arakaki, M. et al. Contemporaneous and recent radiations of the world’s major succulent plant lineages. Proc Natl Acad Sci U S A 108, 8379–8384 (2011).
9. Jones, K. E. et al. PanTHERIA: a species-level database of life history, ecology, and geography of extant and recently extinct mammals. Ecology 90, 2648–2648 (2009).
10. Tobias, J. A. et al. AVONET: morphological, ecological and geographical data for all birds. Ecol Lett 25, 581–597 (2022).
11. de Vos, J. M. et al. Phylogenomics and classification of Cactaceae based on hundreds of nuclear genes. Plant Syst Evol 311, 28 (2025).
12. Thompson, J. B. & Venditti, C. Fast evolving flowers drive cactus diversification. EcoEvoRxiv https://doi.org/10.32942/X2PH1C (2025).
13. Thompson, J. B. et al. CactEcoDB: Trait, spatial, environmental, phylogenetic and diversification data for the cactus family. Figshare https://doi.org/10.6084/m9.figshare.30940019.v2 (2025).