Behind the Paper

The enchantment behind the masks

For those navigating the ups and downs of scientific research, and those interested in burden tests in genetic association studies.

Trang Nguyen May 15, 2026

We are going for our families. We are going for our teammates. We are going for all humanity.

Astronauts of Artemis II, April 2026.

Cover photo by Sanni Sahil on Unsplash

Author’s note: I debated for days on what to write in this blog. Should it be a less technical version of the paper? Should it explain why our research is useful? Should it discuss the future of the field? If you are interested in these topics, your favorite AI tool would probably address them better than I could – and I’m saying this with sincerity. So, I’m going to follow both the letter and the spirit of the series. Because behind the scenes of this paper was more than the data, the analyses, and the endless hours of thinking and writing. Behind the scenes of this paper was a girl falling in love…

It all started in summer 2021 when I joined the mission to decipher the genetic causes of human suffering. On this mission, we use genetic and clinical data to figure out which variants, genes, pathways, cell types, etc. influence a disease or trait. The most intuitive level of discovery is genes, since each gene is a coherent and somewhat well-characterized biological unit. One common approach to identifying relevant genes is burden tests in whole-exome sequencing studies. In burden tests, we often aggregate rare coding variants (i.e., mutations) within a gene and calculate their collective association with a phenotype.
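For readers who want to see the idea concretely, here is a minimal sketch in Python. It is an illustration only, not the pipeline used in the paper: the `Variant` class, the consequence labels, the 1% frequency cutoff, the gene name, and all the toy data are assumptions made up for this example. It keeps the rare variants in one gene that pass a filter, sums each person's qualifying alleles into a single burden score, and then estimates how the phenotype changes with that score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    variant_id: str
    gene: str
    allele_freq: float  # population frequency of the alternate allele
    consequence: str    # predicted effect (labels here are hypothetical)


def burden_scores(genotypes, variants, gene, allowed, max_freq=0.01):
    """Per-sample count of alternate alleles among the variants in `gene`
    that are rare (<= max_freq) and have an allowed predicted consequence."""
    kept = [
        v.variant_id
        for v in variants
        if v.gene == gene and v.consequence in allowed and v.allele_freq <= max_freq
    ]
    return [sum(sample.get(vid, 0) for vid in kept) for sample in genotypes]


def ols_slope(x, y):
    """Least-squares slope of y on x: phenotype change per unit of burden."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        return 0.0  # nobody carries a qualifying variant, so nothing to estimate
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Toy data (entirely made up): four variants in a hypothetical gene.
variants = [
    Variant("v1", "GENE_A", 0.001, "loss_of_function"),
    Variant("v2", "GENE_A", 0.002, "missense"),
    Variant("v3", "GENE_A", 0.003, "synonymous"),
    Variant("v4", "GENE_A", 0.200, "loss_of_function"),  # too common to count as rare
]

# One dict per person: variant id -> number of alternate alleles carried.
genotypes = [{"v1": 1}, {"v2": 1}, {"v3": 1}, {}, {"v4": 2}, {"v1": 1, "v2": 1}]
phenotype = [2.0, 1.0, 0.0, 0.0, 0.0, 3.0]

# Two different filters applied to the same data.
lof_burden = burden_scores(genotypes, variants, "GENE_A", {"loss_of_function"})
broad_burden = burden_scores(
    genotypes, variants, "GENE_A", {"loss_of_function", "missense"}
)

print("LoF only:      ", lof_burden, ols_slope(lof_burden, phenotype))
print("LoF + missense:", broad_burden, ols_slope(broad_burden, phenotype))
```

The two filters give different burden scores, and therefore different effect estimates, from the same people and the same phenotype. A real burden test would also adjust for covariates and report a p-value, which this sketch leaves out.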

But which variants really matter? How do we know? Most variants are harmless: including them all muddles the signal, but including too few might not be enough to detect any signal at all. Among the millions, no, billions of variants, any of which could stray from the norm in any human who has ever lived, we know for sure the impact of very few. The rest is a mystery. The best we can do is to predict their effects with bioinformatic algorithms. Like a child gazing upon the stars and realizing the universe’s vastness, I was amazed and intrigued.

My infatuation, however, quickly turned into incomprehension. The universe was filled with rocks, not roses. As I read more and more, I realized that no two studies used these algorithms in the same way: each created its own ‘masks’ that combined them to filter and group variants. Some selected only ‘loss-of-function’ variants; some lumped them with ‘likely deleterious’ variants. But wait, how each study defined these categories also differed. Some employed two masks, some more than ten. Anecdotally, there seemed to be no consensus! What happened to standards and guidelines and reproducibility in research? Confused and concerned, I put on a full hazmat suit and dug in. I manually reviewed hundreds of publications, recorded hundreds of masks, and found that most of them were indeed not repeated across studies.

My colleagues and I then conducted burden tests using these masks and obtained widely varying gene-trait association results across masks and, consequently, across publications. Same data, same analysis procedure, same statistical tests, but different outcomes. Imagine going to the same restaurant and ordering the same dish, made by the same chef with the same ingredients, except that every day the chef uses a new ratio of the ingredients. What stands between a Michelin-level ribeye steak and a mis-steak may just be a pinch of salt. What stands between a true association and a false negative may just be a pinch of variants.

Throughout this process, my fluster gradually crystallized into a deep appreciation for those who had come before me. There was hardly any manual, yet they kept tinkering, each in their own way. But it also meant there was work to do. Of all the reviewed masks (and some new ones), we wanted to crown the best and brightest – a set that would yield the highest number of significant findings, and reliably so across many phenotypes, at least in our dataset. We tried many approaches, from brute-forcing all masks to using the highly cited ones to clustering. We finally came up with a method that directly maximized the number of significant associations. We validated the set of masks derived by this method and proposed that it be the default baseline for burden tests in future studies. I was proud of our work. It may not have been the love of my life yet, but it was definitely a committed relationship that I wanted to announce to the world.

Until the reviewers and editors arrived… and waved the red flags. Not one. Not two. Not three. I stopped counting. Broken was not the ground. Broken was my heart. My baby was shredded and scrutinized. So I closed them all. My laptop. My eyes. And the door to greatness. How could this be? I was enchanted, but it was not forever and always; it was a cruel summer and now my tears ricochet…

…

48 hours of therapy with Taylor Swift later, I came to my senses. I shook it off because this was the life of a scientist. I re-read the reviewers’ comments and the editors’ guidance with composure and realized that they were all constructive feedback aimed at deepening and expanding our research. They meant well, not harm. Following their suggestions, we examined the masks across a wider range of traits, datasets and ancestries, explored the impact of other factors in burden tests, and discussed the utility of the baseline masks beyond gene discovery. Two whole sections, five extended figures, and even more insights were added. If used as a default, the baseline masks would improve transparency and replicability across new studies; if not, they could act as a benchmark against which to evaluate others. Of course, the final manuscript was not perfect, but the revisions amplified its impact and novelty. Although I appreciate the acceptance of our work, I am even more grateful for the tough but kind lessons from the reviewers and editors, the tremendous effort from my colleagues, and above all, the (mostly) unwavering faith that my mentor, Dr. Jason Flannick, had in me.

This was my first paper, and I’d like to think that throughout the course of my career, the red flags will be sewn into the red carpet to awe and wonder. Continuously and indefinitely. After all, when the professional masks come down, scientists are hopeless romantics. Frequently in despair over the practical constraints of the research process, but forever in love with the limitless unknown.