Spatial transcriptomics technologies, named “Method of the Year 2020”, have undergone rapid development in recent years. They are used to profile spatial locations of all detected mRNAs, providing a new perspective for biologists seeking to understand cells per se as well as their microenvironments. Broadly, spatial transcriptomics technologies can identify undiscovered transcriptional patterns and reconstruct transcriptional panoramas of whole tissues. On a fine-grained level, these technologies can be used to explore the interactions among neighboring cells and intracellular and extracellular states, which helps redefine the function of cells and improves our knowledge of diseases. The current spatial transcriptomics technologies can be mainly classified into two categories. The first category is image-based technologies, including in situ sequencing- and in situ hybridization-based methods, which can profile mRNA with high spatial resolution, especially at the subcellular level. However, limitations such as the low number of profiled genes, low sensitivity of mRNA detection, and time-consuming processes impede the broad application of image-based technologies. The second category is sequencing-based spatial transcriptomics technologies, which capture position-barcoded mRNA with non-gene-specific probes. These technologies can profile the whole transcriptome of tissue sections of any size, and are more user-friendly and less time-consuming than image-based technologies. Moreover, spatial transcriptomics technologies are highly applicable and have been used to improve our understanding of various species, organs, and tissues, including the brain, liver, and tumors.
One critical issue with sequencing-based spatial transcriptomics technologies is their low resolution: each spot can contain multiple cells of several different cell types. This blending can conceal the genuine transcriptional pattern and lead to a distorted cellular-level reconstruction and biological misinterpretation of the tissue. An important task, therefore, is to quantify the proportion of each cell type within each captured spot, known as cellular deconvolution. Following deconvolution, all captured spots can be used to better understand intercellular functions and recover the fine-grained panorama of a heterogeneous tissue.
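To make the idea of deconvolution concrete, here is a minimal sketch that treats each spot as a non-negative mixture of reference cell-type profiles and solves for the mixing weights with non-negative least squares. This is only an illustration under simplifying assumptions (a known, noise-free signature matrix; linear mixing); it is not the implementation of any of the benchmarked methods, and the function name `deconvolve_spots` and the toy matrices are hypothetical.

```python
import numpy as np
from scipy.optimize import nnls


def deconvolve_spots(spot_expr, signatures):
    """Estimate cell-type proportions per spot by non-negative least squares.

    spot_expr:  (n_spots, n_genes) expression matrix of captured spots.
    signatures: (n_cell_types, n_genes) reference profile per cell type
                (assumed here to come from an scRNA-seq reference).
    Returns an (n_spots, n_cell_types) matrix; each non-empty row sums to 1.
    """
    spot_expr = np.asarray(spot_expr, dtype=float)
    signatures = np.asarray(signatures, dtype=float)
    proportions = np.zeros((spot_expr.shape[0], signatures.shape[0]))
    for i, y in enumerate(spot_expr):
        # Solve min ||signatures.T @ w - y|| subject to w >= 0.
        weights, _ = nnls(signatures.T, y)
        total = weights.sum()
        if total > 0:
            # Normalize weights so they can be read as proportions.
            proportions[i] = weights / total
    return proportions


# Toy data (hypothetical): two cell types, three genes, two spots.
signatures = np.array([[10.0, 0.0, 2.0],
                       [0.0, 8.0, 2.0]])
true_props = np.array([[0.7, 0.3],
                       [0.2, 0.8]])
spots = true_props @ signatures  # noise-free linear mixtures
estimated = deconvolve_spots(spots, signatures)
```

On this noise-free toy input the estimated proportions match the mixing proportions used to build the spots. Real data are noisy and sparse, and platform effects separate the reference from the spatial data, which is part of why dedicated deconvolution methods exist.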
In the present study, we conducted a comprehensive benchmark and provide guidelines for the cellular deconvolution of spatial transcriptomics data. Specifically, we evaluated 18 existing computational methods on 50 simulated and real-world datasets, testing their accuracy, robustness, and usability. These methods can be broadly classified as those that require scRNA-seq references and those that do not. Based on their computational techniques, we grouped them into probabilistic, non-negative matrix factorization (NMF)-based, graph-based, optimal transport (OT)-based, and deep learning-based methods. During benchmarking, we used multiple metrics and data resources that varied in spatial transcriptomics technique, spot resolution, number of genes, number of spots, and cell types, to ensure that our assessment was comprehensive and to deepen our understanding of cellular deconvolution methods.
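As an illustration of how accuracy can be scored when ground-truth proportions are known (as in simulated datasets), the sketch below computes two commonly used measures: root-mean-square error and the mean per-spot Pearson correlation. The choice of these two metrics is an assumption made for illustration, since the summary above does not name the specific metrics used in the benchmark. The function names and toy values are hypothetical.

```python
import numpy as np


def rmse(true_props, est_props):
    """Root-mean-square error over all spots and cell types."""
    true_props = np.asarray(true_props, dtype=float)
    est_props = np.asarray(est_props, dtype=float)
    return float(np.sqrt(np.mean((true_props - est_props) ** 2)))


def mean_spot_pearson(true_props, est_props):
    """Average per-spot Pearson correlation between true and estimated
    proportions; spots with constant proportions are skipped because the
    correlation is undefined for them."""
    true_props = np.asarray(true_props, dtype=float)
    est_props = np.asarray(est_props, dtype=float)
    cors = []
    for t, e in zip(true_props, est_props):
        if t.std() > 0 and e.std() > 0:
            cors.append(np.corrcoef(t, e)[0, 1])
    return float(np.mean(cors)) if cors else float("nan")


# Toy example (hypothetical values): two spots, three cell types.
truth = np.array([[0.6, 0.3, 0.1],
                  [0.1, 0.2, 0.7]])
estimate = np.array([[0.5, 0.4, 0.1],
                     [0.2, 0.2, 0.6]])
score_rmse = rmse(truth, estimate)
score_pcc = mean_spot_pearson(truth, estimate)
```

Lower RMSE and higher correlation indicate closer agreement with the ground truth. Using more than one metric helps because a method can rank cell types correctly within a spot while still misjudging their absolute proportions.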
In addition to quantifying and visualizing the results, we produced decision-tree-style guidelines that distill the benchmarking results and incorporate additional features of each method as described in the related publications. These guidelines recommend scenario-specific methods, taking into account computational efficiency and the characteristics of the user's data. We also discuss the general limitations of and future perspectives on cellular deconvolution to give users a clear picture of the field and thus facilitate the improvement of tools for the community.
For more details of our work, please see the original article: "A comprehensive benchmarking with practical guidelines for cellular deconvolution of spatial transcriptomics" in Nature Communications.