Sevahn: When I came to Steve as a first-year graduate student in 2019, I expressed my enthusiasm for diagnostic technology and started out on the cell free RNA (cfRNA) side in the lab. Many of my lab mates worked on single cell transcriptomics, which I found equally exciting – and the plots were unlike anything I had ever seen before! I spent my first summer exploring the fields separately, sequencing cfRNA and working on a separate project with Tabula Muris. I gave my first-ever Quake lab meeting presentation on the cfRNA side and proposed analyzing tissue-of-origin as part of what I planned to work on next. Afterwards, Steve and I debriefed in his office.
Steve: I pulled up the original decomposition of the cell free transcriptome into contributing tissue types, which was done by Winston Koh as part of his thesis in my group, and wanted to repeat the analysis with cell types instead of tissues. Tabula Muris, the first mammalian whole organism cell atlas, was in full swing and we were getting the Biohub queued up to launch Tabula Sapiens, a landmark effort to build a reference atlas of all the cell types in the human body. We had seen a lot of great basic science come out of the various cell atlas efforts in the community, but it all seemed pretty far from medical translation. I wanted to see if Sevahn could do that here by intersecting what had previously been two quite separate lines of research in my lab: single cell transcriptomics and liquid biopsies.
Sevahn and Steve: Thousands of single cell papers are produced each year, facilitating unprecedented insights into the transcriptional heterogeneity driving complex biological systems. However, applications of these datasets to the clinic remain limited in scope. One problem hindered our ability to directly apply existing datasets: all of the atlases we could work with covered only a single tissue or, at best, a small set of tissues. Integration of these datasets was a major obstacle given pervasive batch effects from the different locations, donors, and protocols used to generate these data.
And so began our journey to apply these cell atlases directly to problems impacting human health. With Tabula Sapiens underway, we first sought to use it to decompose cell free RNA from healthy patients into its cellular origins. We found that not all cell free RNA originated from hematopoietic cell types, observing distinct transcriptional contributions from solid tissue-specific cell types, including those from the intestine, liver, lungs, and kidney.
Given the proportion of the healthy cell free transcriptome from non-hematopoietic cells, we then wanted to see if we could measure cellular pathophysiology noninvasively. But it was the middle of lockdown and lab time was limited, so we got resourceful. Several exciting cell free RNA datasets had just come out: Munchel et al published a preeclampsia cohort, Chalasani et al published a non-alcoholic fatty liver disease (NAFLD) cohort, Toden et al published a cohort sequencing plasma from Alzheimer’s disease (AD) patients, and Ibarra et al demonstrated that cell free RNA reflects dynamics of living cells following bone marrow ablation. At best, these studies resolved immune cells or individual tissues. We wanted to see if we could noninvasively measure what was known from histology about cell type specific effects in some of these diseases.
The challenge was that at the time, Tabula Sapiens was only partially complete and comprised only 15 organs and tissues. Given that cell free RNA is a mixture of transcripts reflecting the health status of all the tissues in the body, we needed to be very careful about what we defined as cell type specific. We focused on cell types specific to a single tissue and derived a basic proof. First, we required that a gene be differentially expressed in a given single cell dataset. Second, we used bulk data from the Human Protein Atlas RNA dataset to compute the Gini coefficient, a measure of income inequality in economics, for each gene and only kept genes passing a certain threshold of expression inequality in the organs throughout the human body.
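The two-step filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the function names, the toy expression values, the gene labels, and the threshold of 0.6 are all hypothetical, and the post does not say which cutoff was actually applied.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 means perfectly even,
    values near 1 mean expression is concentrated in very few tissues."""
    x = sorted(float(v) for v in values)
    n = len(x)
    total = sum(x)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted formula on the sorted values.
    weighted = sum(rank * v for rank, v in enumerate(x, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def tissue_specific_genes(bulk_expression, de_genes, threshold=0.6):
    """Keep differentially expressed genes whose bulk expression across
    tissues is unequal enough (Gini >= threshold, a hypothetical cutoff)."""
    return [
        gene
        for gene in de_genes
        if gene in bulk_expression and gini(bulk_expression[gene]) >= threshold
    ]


# Toy bulk expression across four tissues (made-up numbers and gene labels).
bulk = {
    "GENE_SPECIFIC": [0.0, 0.0, 0.0, 120.0],   # one tissue only
    "GENE_BROAD": [50.0, 48.0, 52.0, 50.0],    # roughly even everywhere
}
# Step 1 (differential expression in the single cell dataset) is assumed done.
differentially_expressed = ["GENE_SPECIFIC", "GENE_BROAD"]
kept = tissue_specific_genes(bulk, differentially_expressed)
```

Here only the gene expressed in a single tissue survives the filter, which is the behavior the Gini step is meant to enforce.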
This led to a principled approach that enables repurposing any published individual tissue cell atlas into a set of genes specific to a given cell type in the context of the whole body, which we could then examine noninvasively. It also provides a way to synthesize individual tissue datasets with atlases such as Tabula Sapiens.
With our derived cell type specific gene profiles, we then assessed the signature score of these genes in diseased cfRNA. In chronic kidney disease, we found that we could noninvasively resolve tubular atrophy. Hepatocyte steatosis and subsequent cell death are a histologic hallmark of non-alcoholic steatohepatitis (NASH) and NAFLD, and we observed an elevated hepatocyte signature score in both cohorts compared to a healthy cohort. In Alzheimer’s pathogenesis, marked by neuronal death and synaptic loss, we observed a decreased neuronal signature score in AD and further observed differences in the signature scores of several glial cell types. Furthermore, several differentially expressed genes in both the Alzheimer’s and NAFLD studies significantly intersected with our cell type specific gene lists for brain cell types and hepatocytes respectively, further underscoring the ability to noninvasively resolve cellular pathology in a way that generalizes to numerous diseases.
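The post does not spell out how a signature score is computed. A minimal sketch, assuming the score is the sum of log-transformed normalized expression values over the genes in a cell type profile, might look like the following. The function name, the pseudocount, the gene labels, and the sample values are hypothetical and chosen only for illustration.

```python
import math


def signature_score(sample_expression, gene_profile, pseudocount=1.0):
    """Sum of log(expression + pseudocount) over the genes in a cell type
    profile. Genes absent from the sample contribute log(pseudocount)."""
    return sum(
        math.log(sample_expression.get(gene, 0.0) + pseudocount)
        for gene in gene_profile
    )


# Hypothetical cell type profile and two made-up cfRNA samples
# (normalized expression values, not real data).
profile = ["GENE_A", "GENE_B"]
healthy_sample = {"GENE_A": 3.0, "GENE_B": 1.0, "GENE_C": 40.0}
disease_sample = {"GENE_A": 30.0, "GENE_B": 12.0, "GENE_C": 41.0}

healthy_score = signature_score(healthy_sample, profile)
disease_score = signature_score(disease_sample, profile)
```

Comparing such scores between a disease cohort and a healthy cohort is the kind of contrast described above. Genes outside the profile, such as `GENE_C` here, do not affect the score.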
This study underscores the value of combining two long-standing lines of research and represents one of the first potential clinical applications emerging from single cell transcriptomic data. We’re optimistic that this is only the beginning, both for the unprecedented cellular resolution that we can achieve in noninvasive liquid biopsy and for the clinical insights we can derive using single cell data and whole organism transcriptomic cell atlases.
Correspondence: Prof. Stephen Quake, firstname.lastname@example.org
Cover art: Rachel LaMantia