Disease networks for everyone!

After almost every talk I give, there’s a moment during the Q&A when someone asks, “Amazing dataset! Can you share it?” I know what’s coming next, the part where I have to say no. The engineer in me would love to share everything: the data, the code, and the entire pipeline that led to our results. In an ideal world, science thrives on openness, and I firmly believe in transparency and reproducibility. But the reality is different. Medical data is sensitive and locked behind layers of security, and for good reason: it belongs to real people. So, instead of an enthusiastic “Yes!” I have to say, “Unfortunately, I can’t share the data, but I’d be happy to check something in our data for you.”
Throughout my PhD, this dilemma followed me. While I was busy thinking about impactful research questions, another question constantly occupied my mind: How do we balance data privacy with the core principles of open science? How do we make medical research more reproducible when the data can’t be freely shared? It was a problem I couldn’t ignore—an obstacle between me and the kind of transparency I wanted to bring to my work.
I had already worked on several side projects—analyzing comorbidity networks of mental disorders, mapping the trajectories of depressed patients, and constructing multilayer comorbidity networks. I shared insights and results with collaborators, but the raw data remained locked away and inaccessible beyond my own research group. Then came a request—colleagues from the University of Belgrade reached out, asking if they could use our multilayer comorbidity networks to test a new method for extending subgraphs. My first instinct was hesitation. But actually, sharing these networks didn’t mean sharing private medical data. A comorbidity network isn’t a patient record—it’s a matrix of relationships, a structure that reveals how diseases connect without exposing individual details. And just like that, I had my aha! moment. I could share valuable datasets for research while fully respecting privacy laws.
However, previous research had shown me that there is no standard way to construct these networks: the right approach depends on the research question, and how a network is built affects its interpretability and applicability. Over many studies, key methods have emerged that effectively measure, combine, and analyze these networks across a wide range of applications. Drawing on this experience, we have now created a research-ready dataset that lets others work with comorbidity networks without needing deep expertise in building them from scratch.
Comorbidity networks, or disease-disease networks, represent relationships between diseases. These networks have become essential tools in network medicine, enabling a more systemic understanding of diseases. Graph-based representations are increasingly vital in AI, particularly in graph neural networks. By abstracting disease connections into networks, we enable research without compromising sensitive patient information.
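To make the idea of abstraction concrete, here is a minimal sketch of how patient-level records can be collapsed into a disease-disease matrix. Everything in it is an assumption for illustration: the four fictional patients, the ICD-10-style labels, the helper names `cooccurrence_counts` and `adjacency_matrix`, and the use of a raw co-occurrence count as the edge weight. It is not the construction method behind the released dataset.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: each set holds the diagnoses of one fictional patient.
# The labels are illustrative ICD-10-style codes, not taken from the released dataset.
patients = [
    {"E11", "I10", "N18"},
    {"E11", "I10"},
    {"I10", "F32"},
    {"E11", "N18"},
]


def cooccurrence_counts(records):
    """Count, for every unordered disease pair, how many patients have both."""
    counts = Counter()
    for diagnoses in records:
        # Sorting gives each pair a canonical (a, b) order with a < b.
        for a, b in combinations(sorted(diagnoses), 2):
            counts[(a, b)] += 1
    return counts


def adjacency_matrix(counts, diseases):
    """Build a symmetric matrix of pair counts, ordered as in `diseases`."""
    index = {d: i for i, d in enumerate(diseases)}
    matrix = [[0] * len(diseases) for _ in diseases]
    for (a, b), n in counts.items():
        matrix[index[a]][index[b]] = n
        matrix[index[b]][index[a]] = n
    return matrix


counts = cooccurrence_counts(patients)
diseases = sorted(set().union(*patients))
matrix = adjacency_matrix(counts, diseases)

# Print one row per disease; each entry is a pairwise patient count.
for disease, row in zip(diseases, matrix):
    print(disease, row)
```

The resulting matrix has one row and one column per disease and holds only pairwise totals, which is what makes this kind of object shareable where the underlying records are not. In practice a raw count would usually be replaced by a statistical association measure chosen to suit the research question.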
These networks are the engines that drive much of our research on aging and multimorbidity. We recently used them to identify critical events in life-spanning disease trajectories and to develop an epidemiological model to forecast multimorbidity in the population. And we believe this is just scratching the surface of what can be done with these networks. That is why we are thrilled to announce the release of a research-ready dataset of various types of comorbidity networks, constructed from nationwide hospitalization data covering 45 million hospital stays over 17 years. This dataset is designed to fuel new discoveries in public health, network medicine, and AI-driven healthcare research.
To make these networks even more accessible, we’ve also developed an interactive web app where you can explore comorbidity networks with different properties. Check it out!
The synergy between network science, medicine, and AI is shaping the future of healthcare by making disease prediction and treatment more data-driven and efficient. We believe this dataset can help address a wide range of public health and biomedical research questions. We encourage you to explore it and see where it takes you.