A community-based protocol for the statistical analysis of non-targeted metabolomics data

Non-targeted metabolomics is distinguished from its targeted counterpart by its exploratory nature, which aims to capture the entire spectrum of small molecules present in a sample. This approach typically generates large, complex datasets that require sophisticated analysis tools to identify and interpret the relevant chemical signatures that reflect underlying biological processes. Several tools and platforms have been developed to aid in this process, notably Feature-based Molecular Networking (FBMN) within the Global Natural Products Social Molecular Networking (GNPS) metabolomics cloud ecosystem (https://gnps2.org/). FBMN has become a cornerstone in metabolomics research, enabling researchers to annotate and connect features across samples. However, the subsequent statistical analysis of these features has remained a significant roadblock, particularly for those who are not experts in computational methods. The fragmented nature of available tools, scattered across different platforms and requiring customized scripts, adds to the challenge, especially for newcomers to the field. The need for a comprehensive, user-friendly guide that integrates multiple statistical approaches into a cohesive analysis pipeline became increasingly apparent.
To address these challenges, we developed a detailed protocol that guides researchers through the entire process of analyzing FBMN results. Designed as an end-to-end solution, the protocol begins with feature detection and continues through data clean-up, statistical analysis, and spectrum annotation. By providing ready-made code for the popular statistical platforms R and Python, as well as a graphical user interface (GUI), we aimed to make the toolkit accessible to a wide range of users. The protocol is fully integrated with FBMN, and input files can be loaded directly from GNPS, ensuring seamless workflow compatibility. For users who prefer a more interactive approach, we developed a web application with a GUI, available both online (https://fbmn-statsguide.gnps2.org/) and as a downloadable application (https://www.functional-metabolomics.com/resources). This tool is designed not only for experienced researchers but also for educational purposes, making it an ideal resource for students and early-career scientists.
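To give a flavor of what a data clean-up step can look like, here is a minimal Python sketch. It is not the protocol's own code: the function names, the toy feature table, the 0.3 blank-to-sample cutoff, and the choice of blank removal followed by total-intensity normalization are all assumptions made for illustration.

```python
from statistics import mean


def remove_blank_features(table, sample_cols, blank_cols, cutoff=0.3):
    """Keep features whose mean blank intensity is below cutoff * mean sample intensity.

    `cutoff` is an illustrative value, not a recommendation from the protocol.
    """
    kept = {}
    for feature, row in table.items():
        sample_mean = mean(row[c] for c in sample_cols)
        blank_mean = mean(row[c] for c in blank_cols)
        if sample_mean > 0 and blank_mean / sample_mean < cutoff:
            # Blank columns are no longer needed once the filter is applied.
            kept[feature] = {c: row[c] for c in sample_cols}
    return kept


def normalize_by_total_intensity(table, sample_cols):
    """Divide each intensity by the summed intensity of its sample."""
    totals = {c: sum(row[c] for row in table.values()) for c in sample_cols}
    return {
        feature: {c: (row[c] / totals[c] if totals[c] else 0.0) for c in sample_cols}
        for feature, row in table.items()
    }


# Hypothetical feature table: feature ID -> intensity per sample (S*) or blank (B*).
table = {
    "F1": {"S1": 100.0, "S2": 300.0, "B1": 10.0},
    "F2": {"S1": 50.0, "S2": 50.0, "B1": 45.0},  # mostly background signal
    "F3": {"S1": 300.0, "S2": 100.0, "B1": 0.0},
}

cleaned = remove_blank_features(table, ["S1", "S2"], ["B1"])
normalized = normalize_by_total_intensity(cleaned, ["S1", "S2"])
```

In this toy table, F2 is dropped because its blank intensity is nearly as high as its sample intensity, and the remaining intensities are rescaled so that each sample sums to 1. A real analysis would start from the quantification table exported by FBMN rather than a hand-written dictionary.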
This protocol was developed with the support of the Virtual Multiomics Lab (VMOL), a community-driven, open-access virtual laboratory (https://vmol.org/). Initiated in 2022, this project aims to democratize access to non-targeted metabolomics analysis strategies, workflows, and expertise, making computational mass spectrometry accessible to researchers worldwide, regardless of their background or resources.
The Role of Virtual Labs in Democratizing Computational Metabolomics
The development of this protocol began during a summer school for non-targeted metabolomics at the University of Tuebingen in 2022, for which we had developed a series of R notebooks for the statistical analysis of metabolomics results. During this summer school we also launched a virtual working group, which we called the Virtual Multiomics Lab. VMOL is an open initiative that seeks to break down the barriers to scientific collaboration and education, and it is within VMOL that this protocol, and ultimately this paper, were further developed.
VMOL is open to everyone and connects computational biologists, chemists, and bioinformaticians from around the world in a virtual research group. By removing physical and economic barriers, VMOL provides training in computational mass spectrometry and bioinformatics/data science and launches virtual research projects as a new form of collaborative science. Central to VMOL's mission is training any interested researcher in mass spectrometry and computational metabolomics, regardless of background, circumstance, or geographical location. This emphasis on inclusivity and accessibility reflects a simple observation: while diversity is valued in the scientific community, economic barriers often prevent researchers from participating in critical events such as conferences and workshops. These events are vital for exchanging ideas, fostering collaborations, and creating opportunities, but they remain out of reach for many because of the costs involved.
Towards a More Inclusive Scientific Community
Nature Protocols
This journal publishes secondary research articles and covers new techniques and technologies, as well as established methods, used in all fields of the biological, chemical and clinical sciences.