A Recipe for Complexity: Building the Potato Pan-Genome

For a crop so humble in appearance, the potato hides an unexpected level of genetic complexity, a lesson we learned the hard way. Our team's entry into potato genetics began in 2016 with a naive yet optimistic email. Korbinian Schneeberger, team leader (MPIPZ, Germany), reached out to Maarten Koornneef (then director of MPIPZ), eager to test a new sequencing technology (single-molecule sequencing) for haplotype phasing in heterozygous genomes. He casually suggested potato as a test case, unaware of the complexities waiting ahead.

Preparation Time: 2-6 years 

Our first attempt, led by former group member Wen-Biao Jiao, quickly revealed that potato is a particularly stubborn ingredient. Potato genomes are autotetraploid, meaning each cell contains four copies of every chromosome. Imagine the potato genome as a puzzle with four overlapping, nearly identical layers, each subtly different. These layers, or haplotypes, made it extremely difficult to tell the four copies of each chromosome apart. At the time, sequencing technologies generated shorter, less accurate reads, resulting in smaller puzzle pieces that often contained errors. Consequently, no method could assemble them into individual chromosomes.

After early unsuccessful attempts, it was clear we needed a new strategy. The breakthrough came in 2019 when we turned to single-cell sequencing. José and colleagues developed haplotyping methods using single-cell sequencing of gametes (Campoy and Sun et al., Genome Biology, 2020), making potato genome assembly more manageable. By 2022, this method produced the first fully assembled genome of a tetraploid potato (Sun and Jiao et al., Nature Genetics, 2022). Surprisingly, the four haplotypes shared large parts of their sequence, raising new questions about diversity in potatoes.

(Some) Fresh Ingredients Required

The potato's story begins nearly 10,000 years ago in the Andean highlands of South America. Indigenous communities domesticated wild potato species, selecting plants suited to diverse environments. Brought to Europe by Spanish explorers in the 16th century, a single subspecies (Solanum tuberosum ssp. tuberosum) spread rapidly as a crop. European potato was further bottlenecked by the blight epidemics of the 19th century. It was here, before the start of modern breeding, that Korbinian and Hequan set out to capture the raw ingredients of Europe’s potato diversity. We pulled 21 historical cultivars out of the freezer and used shallow sequencing to select ten that offered a good mix of flavors (aka genetic diversity) to create a pan-genome of tetraploid European potato. Scaling up to ten assemblies, how difficult could it be?

Initial optimism quickly wilted: generating ten sets of single-gamete sequencing data was challenging and costly. To tackle this, we developed a novel method to assemble and phase complex genomes using an alternative sequencing technology (Hi-C). This method allowed us to reconstruct tetraploid potato genomes with far greater clarity (Figure 1).

Figure 1. Hequan’s haplotype-resolved potato recipe

The year 2023 brought major changes, both scientifically and personally. After a decade at MPIPZ, Hequan established his own research group in Xi’an, China (XJTU). To keep momentum going, we linked MPIPZ, LMU, and XJTU into an international team. This ambitious partnership brought plenty of scheduling headaches. Sergio still grumbles about those brutal 5:30 am genome meetings in Colombia, while Hequan endured countless late-night calls in China.

Just when we thought we had reached the finish line, Hequan cheerfully announced another "small improvement", forcing everyone to start over and repeat analyses from scratch. Soon, “just one more genome update” became our unofficial daily routine, courtesy of Hequan’s tireless pursuit of precision. The rest of us responded to each announcement with resigned laughter and yet another round of coffee.

Cooking time: 2 weeks

February 2024. Hequan’s continuous work to improve the assemblies soon ran up against the demands of his new position as a Professor in Xi’an. Whispers began to circulate of another potato paper in the publishing pipeline. One extra scoop at this point might have spoiled the whole recipe. Together, these circumstances drove us into a flurry of work (imagine the last minute of a cooking show challenge) to finish the manuscript, which we finally submitted at 4:47 am Central European Time.

Serves: 10 

One year (and several reheated drafts) later, we finally plated up our potato pan-genome. Our first major finding was that European potatoes harbor surprisingly high sequence diversity, much greater than previously known. This rich diversity primarily resulted from historical genetic exchanges (introgressions) with multiple geographically widespread wild potato species, spanning from Mexico to Argentina.

Yet, despite such extensive genetic variation, potatoes presented a paradox: high diversity at the sequence level, but little variation in how those sequences recombined. Think of it like having a pantry full of spices but always cooking the same dish. Of the 40 haplotypes that we assembled, only about nine were truly unique in any given genomic region. This revealed a surprisingly narrow genetic base in European potatoes, shaped by domestication, adaptation, and breeding bottlenecks.
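For readers who like to see the counting spelled out, here is a minimal sketch of what "unique haplotypes in a given genomic region" means. It is a toy illustration rather than the analysis from the paper: the function name, the window size, and the example sequences are all made up, and it treats two haplotypes as shared only when they are exactly identical within a window, whereas a real analysis would work on aligned assemblies and tolerate small differences.

```python
from typing import Dict, List


def distinct_haplotypes_per_window(haplotypes: Dict[str, str], window: int) -> List[int]:
    """Count distinct haplotype sequences in consecutive, non-overlapping windows.

    Assumption (hypothetical setup): all haplotypes are already aligned to the
    same coordinates, and only exact identity within a window counts as sharing.
    """
    length = min(len(seq) for seq in haplotypes.values())
    counts = []
    for start in range(0, length, window):
        # A set keeps one copy of each distinct sequence found in this window.
        alleles = {seq[start:start + window] for seq in haplotypes.values()}
        counts.append(len(alleles))
    return counts


# Toy input: 2 cultivars x 4 haplotypes each = 8 aligned haplotypes.
# (The pan-genome described here has 10 cultivars x 4 = 40 haplotypes.)
toy = {
    "cultivarA_h1": "AAAACCCC",
    "cultivarA_h2": "AAAACCCC",
    "cultivarA_h3": "AAATCCCC",
    "cultivarA_h4": "AAAACCGC",
    "cultivarB_h1": "AAATCCCC",
    "cultivarB_h2": "AAAACCGC",
    "cultivarB_h3": "AAAACCCC",
    "cultivarB_h4": "TAAACCCC",
}

counts = distinct_haplotypes_per_window(toy, window=4)
print(counts)  # [3, 2]
```

In this toy example, eight haplotypes collapse to three distinct versions in the first window and two in the second, the same kind of collapse as 40 haplotypes reducing to about nine.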

Surprisingly, having fewer haplotypes made it easier to develop practical genomic tools. Since many potato genomes share similar DNA combinations, we built the “haplotype-graph”, a computational tool that uses short-read sequencing data to reconstruct the genomes of potato samples not included in the current pan-genome. Initially, Manish confidently claimed, “Give me two weeks, it'll be straightforward.” Amusingly, this spiraled into a year-long saga filled with unexpected hurdles, pulling Craig in and entangling both in extensive analyses.

The haplotype-graph proved its worth by reconstructing the genomes of commercially important potato cultivars, including the iconic “Russet Burbank”. This demonstrates how our pan-genome provides breeders and researchers with tools to harness potato genetic diversity, driving future efforts for more resilient and productive varieties.

Future Recipes: Next Steps in Potato Genomics

With our pan-genome recipe now tried and tested, future steps promise new opportunities. Sequencing 24 more potato genomes could capture 95% of the haplotype space, bringing us closer to a truly complete pan-genome. Much like selecting special ingredients, targeted sequencing of cultivars with unique traits, especially those derived from recent breeding and introgressions, will be essential. 

Our "cooking" journey was a truly collaborative effort, driven by the dedication of team members around the world. From early mornings in Colombia to late-night genome marathons in China, India, Australia, and Germany, our diverse expertise shaped every stage of the process. The potato pan-genome may never be a finished dish, but the quest itself continues to reveal endless opportunities, complex puzzles, and always, room for improvement.

Hequan Sun, Sergio Tusso, Craig I. Dent & Manish Goel

References

  1. Campoy, J. A., Sun, H., Goel, M., Jiao, W.-B., Folz-Donahue, K., Wang, N., Rubio, M., Liu, C., Kukat, C., Ruiz, D., Huettel, B., & Schneeberger, K. (2020). Gamete binning: Chromosome-level and haplotype-resolved genome assembly enabled by high-throughput single-cell sequencing of gamete genomes. Genome Biology, 21(1), 306. https://doi.org/10.1186/s13059-020-02235-5
  2. Sun, H., Jiao, W.-B., Krause, K., Campoy, J. A., Goel, M., Folz-Donahue, K., Kukat, C., Huettel, B., & Schneeberger, K. (2022). Chromosome-scale and haplotype-resolved genome assembly of a tetraploid potato cultivar. Nature Genetics, 54(3). https://doi.org/10.1038/s41588-022-01015-0
