A leap into systems biology

The bottom-up 'omics' and the top-down modeling approach for integrative physiology and pathology with implications in drug discovery – a data science perspective.

In this leap month, I am expanding on my previous note about the underlying rationale of drug discovery and personalized medicine to bring together two key concepts in the field: systems biology and -omics data. In short, systems biology can delineate new biological mechanisms and drive translational advances in medicine.

Having previously described the fundamentals of drug discovery from the interdisciplinary perspective of synthetic chemistry, molecular biology, and biochemistry, I now turn to big data in drug discovery: investigating existing clinical and preclinical datasets (transcriptomics, metabolomics, imaging, and other data types) to arrive experimentally at successful drug-candidate combinations [Bielekova 2014, Azer 2023]. The data science-driven drug discovery arc forms a roadmap of five key stages (Figure 1):

  • Discovery – to characterize the mechanism of disease and identify potential mechanisms of reversing disease biology to restore health.
  • Priority – ranking the mechanism of disease against the drug candidate and predictions of its mechanism of action.
  • Design – a step that entails the selection and confirmation of drug candidates with the highest potential to affect the intended mechanism-of-action.
  • Optimization – to find the optimal composition, component ratios, and the dose to achieve the maximum effect of treatment relative to clinical studies.
  • Translation – developing a clinical path to inform clinical study design and include strategies of pharmacology to validate drug efficacy in the clinic.
Figure 1: Systems biology as a drug discovery and development engine [Azer 2023].  

Attempts to understand the mechanism-of-action of drugs using data science include gathering mechanistic data from multiple sources, such as scientific literature, databases, and pathway maps [Wang 2020]. Whether a newly developed drug moiety finally reaches its primary target or pathway of interest depends on a range of physicochemical properties.

For instance, investigators must consider the size, shape, and polarity of the compound, its chemical stability, metabolic stability, clearance rate, bioavailability, half-life, and off-target effects (Figure 2). Although traditional drug discovery originated with the intention of studying 'one-drug and one-target' interactions, data science has continued to accumulate evidence that medicinal compounds generally act on the entire related biological pathway rather than on a single target alone.

By mining the relevant data sources, scientists can identify existing molecular mechanisms and drug targets for a given disease, and then test new small-molecule drug candidates against the same biological targets to observe similar, if not more personalized, outcomes that can be optimized for precision medicine [Chen 2016]. This is a brief and simple post about the anatomy of systems biology, which links the complexities of experimental biology to the intricacies of the drug discovery pathway through -omics, data science, and mathematical/computational modeling.

Figure 2: Deciphering drug likeness based on the structure of the compound, its properties, and profile [Novartis].

A quick recap - machine learning in drug discovery

The more technical aspects of data science include literature mining approaches, such as natural language processing and keyword-based, semantic-based, and ontology-based methods, that reveal the molecular mechanisms of interest in a disease and match them with potential drug targets with increased accuracy [Wang 2020]. Existing studies have outlined machine learning approaches that integrate data from multiple such sources to build disease networks from large-scale datasets, relying on experimental outcomes alongside historical clinical data and existing biomarkers of disease [Kanehisa 2017] (Figure 3).

Figure 3: The flowchart of a disease-combined machine learning framework named LSA-PU-KNN [abbreviation for (latent semantic analysis) – (positive unlabeled) – (k nearest neighbor framework)] to predict potential drug-pathway associations [Wang 2020].

With machine learning methods, drug-pathway associations can be predicted by fusing multiple features. For instance, drug features can be grouped according to similarity in chemical structure and in molecular functional groups. Additional frameworks predict potential drug-pathway associations by combining features of

  • drug-drug similarity,
  • drug-disease associations, and
  • pathway-pathway similarity, alongside
  • pathway-disease associations, and
  • pathway related gene expression, to clearly delineate a drug-pathway pair [Wang 2020].
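As a rough illustration of what fusing features can look like, the sketch below averages two drug-drug similarity measures and then scores candidate pathways for a query drug from its most similar neighbours. It is a minimal toy example and not the LSA-PU-KNN method of [Wang 2020]; the drug names, pathway names, similarity values, and the equal weighting are all hypothetical assumptions chosen for readability.

```python
# Toy sketch of similarity-fused nearest-neighbour scoring for drug-pathway pairs.
# All names and numbers below are hypothetical and for illustration only.

# Two drug-drug similarity sources (e.g. chemical structure, shared disease links).
structure_sim = {
    frozenset({"drugA", "drugB"}): 0.8,
    frozenset({"drugA", "drugC"}): 0.2,
    frozenset({"drugB", "drugC"}): 0.1,
}
disease_sim = {
    frozenset({"drugA", "drugB"}): 0.6,
    frozenset({"drugA", "drugC"}): 0.4,
    frozenset({"drugB", "drugC"}): 0.1,
}

# Known drug-pathway associations (the labelled examples).
known_associations = {
    "drugB": {"pathway1"},
    "drugC": {"pathway2"},
}


def fused_similarity(a, b, weight=0.5):
    """Weighted average of the two similarity sources (weight is an assumption)."""
    if a == b:
        return 1.0
    pair = frozenset({a, b})
    return weight * structure_sim.get(pair, 0.0) + (1 - weight) * disease_sim.get(pair, 0.0)


def score_pathways(query, k=2):
    """Score pathways for `query` from its k most similar drugs with known pathways."""
    candidates = [d for d in known_associations if d != query]
    neighbours = sorted(candidates, key=lambda d: fused_similarity(query, d), reverse=True)[:k]
    total = sum(fused_similarity(query, d) for d in neighbours)
    if total == 0:
        return {}
    scores = {}
    for drug in neighbours:
        for pathway in known_associations[drug]:
            scores[pathway] = scores.get(pathway, 0.0) + fused_similarity(query, drug)
    # Normalise so the scores over all candidate pathways sum to 1.
    return {pathway: s / total for pathway, s in scores.items()}


scores = score_pathways("drugA")
for pathway, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(pathway, round(score, 3))
```

In this toy setting the pathway linked to the most similar drug ranks first. A real framework would learn the fusion weights and would also bring in the pathway-side features listed above.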

Such networks can be overlaid onto several pathways to obtain additional information and to unveil the potential mechanistic pathway of a novel drug through mechanistic modeling approaches [Bai 2019]. A variety of such drug-pathway associations and databases are highlighted in Table 1 [Wang 2020].

Table 1: Drug pathway associations via databases and web associations [Wang 2020].

Drug discovery – prioritizing traditional pathways and creating modern maps

Computational workflows can speed up the accrual of big data through semi-automated and efficient analysis, to identify potential drug molecules that can reverse components of the disease mechanistic pathway [Denaro 2023]. For instance, work in mathematical biosciences has produced a pipeline to test the mechanism-of-action of four-drug combinations for tuberculosis treatment [Denaro 2023]. Such processes can benefit from quantitative systems pharmacology and physiologically based pharmacokinetics [Ashfaq 2022] to effectively replace animal models and increase the accuracy of analysis, while reducing costs in the field of medicine.

Figure 4: Evaluating the capacity to reuse drugs across databases to build a platform of common therapies for drug repurposing based on genes, and for drug repurposing based on biological pathways [Otero-Carrasco 2023].

The analytical methods can also provide insights into the mechanism of action of a new therapeutic component developed by integrating human genetics, clinical precedence, and preclinical experimental models, in order to carry out clinical trials and ultimately obtain FDA approval for a new drug compound. Recent efforts in systems biology transcend the ‘one-drug-one-target-one-disease’ paradigm to more broadly impact biological pathways.

This is highlighted by ‘drug repurposing based on biological pathways (DREBIOP)’, a module that offers distinctive pattern discovery among therapeutic agents to repurpose drugs and build a medical database (Figure 4). Similarly, Gene2Drug is another computational tool for effective pathway-based drug repurposing [Napolitano 2018].
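The idea behind pathway-based repurposing can be sketched with a simple gene-set overlap: rank drugs by how much their target genes overlap with the genes of a disease-relevant pathway. The example below is an assumption-laden toy and does not reproduce the DREBIOP or Gene2Drug algorithms; the gene identifiers, drug names, and the choice of the Jaccard index as the overlap measure are all hypothetical.

```python
# Toy sketch of pathway-based repurposing via gene-set overlap.
# Gene, drug, and pathway names are placeholders, not real annotations.

pathway_genes = {
    "pathway_X": {"G1", "G2", "G3", "G4"},
}

drug_targets = {
    "drug_1": {"G1", "G2"},
    "drug_2": {"G4", "G5", "G6"},
    "drug_3": {"G7"},
}


def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_drugs_for_pathway(pathway, min_shared=1):
    """Return (drug, jaccard score, shared genes), best overlap first."""
    genes = pathway_genes[pathway]
    ranked = []
    for drug, targets in drug_targets.items():
        shared = targets & genes
        if len(shared) >= min_shared:
            ranked.append((drug, jaccard(targets, genes), sorted(shared)))
    return sorted(ranked, key=lambda row: row[1], reverse=True)


for drug, score, shared in rank_drugs_for_pathway("pathway_X"):
    print(drug, round(score, 3), shared)
```

Drugs with no shared genes drop out of the ranking. In practice a statistical enrichment test and curated pathway annotations would replace the raw overlap used here.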

Engineering biomechanisms from the bottom-up

All computational models in biology rest on mathematical foundations and draw on the available omics data to advance biochemical system modelling and to estimate unknown variables across multiple pathological pathways. These frameworks can therefore assist the workflow of a conceptualized biomechanical pathway, to ‘bottom-up engineer’ pathophysiology based on experimental outcomes first observed in the lab and the clinic. From the perspective of data science, bottom-up engineering a pathological pathway for clinical translation entails:

  • gathering the project-specific principles,
  • output prioritization, and
  • detailing the biology of interest.

The strategy represents cross-functional efforts between interdisciplinary research areas to create a clinical platform that also predicts drug candidates and their mechanisms of action relative to the biological pathway (Figure 5).

Preclinical and clinical data and computational models can facilitate the clinical development phase of pathway-related drugs to advance the understanding of several mechanisms of action, including -omics data integration to obtain additional insights into multicellular systems [Butcher 2004][Kiyosawa 2016].

Figure 5: The development cycle of integrated in silico models using component level and system response data, ideally based on complex human cell-based assays conducted in the lab. The component level ‘omics’ data provides a scaffold [Butcher 2004].

An evolving pharmacological landscape for drug design

Thus far it is clear that the advent of omics-related molecular datasets at the genomics, transcriptomics, proteomics, and metabolomics levels can lay the groundwork to decode dynamic biological networks and to understand mechanisms of disease via a series of hypothesis-led experiments in the lab. However, human biology is inherently complex, and pathological perturbations can add even more complexity, which eventually requires a systemic and personalized approach towards individualized treatment [Azer 2023].

When mining for disease-related data, therefore, databases offer a first step to reproduce and translate preclinical models to human pathology. As bioinformatics tools, advanced mathematical models can be applied to large-scale preclinical and clinical datasets to study biological systems and to design successful clinical biomarkers that carry over into the translational and clinical space (Table 2) [Wang 2020].

Taken together, these efforts can facilitate the enrollment of the correct patient subset from heterogeneous patient populations to detect drug activity and disease mechanisms, and to optimize endpoints and clinical outcomes for decision-making among at-risk groups, as seen in patients with neurological disorders [Siderowf 2023]. Using systems pharmacology, it is possible to carry out dynamic simulations of drug responses to improve the prediction of drug toxicities in the clinic, beyond the constraints of conventional preclinical toxicology studies.
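To give a flavour of what a dynamic simulation of drug exposure involves, the sketch below uses the textbook one-compartment model with first-order elimination, where the rate constant is k = CL / V and the half-life is ln(2) / k. It is a deliberately simplified assumption and far smaller than a quantitative systems pharmacology or physiologically based pharmacokinetic model; the dose, clearance, and volume values are hypothetical.

```python
import math

# Minimal one-compartment pharmacokinetic sketch (intravenous bolus doses,
# first-order elimination). Parameter values used below are hypothetical.


def elimination_rate(clearance_l_per_h, volume_l):
    """First-order elimination rate constant k = CL / V (per hour)."""
    return clearance_l_per_h / volume_l


def half_life_h(clearance_l_per_h, volume_l):
    """Elimination half-life t1/2 = ln(2) / k (hours)."""
    return math.log(2) / elimination_rate(clearance_l_per_h, volume_l)


def concentration_mg_per_l(t_h, dose_mg, dose_times_h, clearance_l_per_h, volume_l):
    """Plasma concentration at time t_h, summing the decaying contribution of each dose."""
    k = elimination_rate(clearance_l_per_h, volume_l)
    total = 0.0
    for t_dose in dose_times_h:
        if t_h >= t_dose:
            total += (dose_mg / volume_l) * math.exp(-k * (t_h - t_dose))
    return total


# Example: 100 mg every 12 hours, clearance 5 L/h, volume of distribution 50 L.
dosing_schedule = [0.0, 12.0, 24.0]
print("half-life (h):", round(half_life_h(5.0, 50.0), 2))
for t in (0.0, 6.0, 12.0, 24.0):
    c = concentration_mg_per_l(t, 100.0, dosing_schedule, 5.0, 50.0)
    print(f"t = {t:4.1f} h, concentration = {c:.2f} mg/L")
```

Even this small model shows how clearance and dosing interval shape accumulation over repeated doses, which is the kind of question larger mechanistic models address with many more compartments and pathway-level detail.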

Table 2: List of different types of computational methods [Wang 2020].

Roadmap to the clinic: optimization and clinical translation of pharmacology efficiency

The process of designing and developing drug combinations to treat rare diseases as well as oncology is both clinically and experimentally demanding. Mathematical modeling and quantitative systems pharmacology offer platforms to determine the chance of success in advance, for personalized treatment strategies in an actual clinical environment [Azer 2023]. These efforts can facilitate the optimization of individual concentrations of drug doses across multiple preclinical models to generate robust data.

Experimental models can also be developed to evaluate the impact of genetic variants and epigenetic modifications, relevant to the dose and therapeutic effect of a drug – for subsequent clinical translation. Successful clinical translation results in the development of a clinical path, to inform clinical study design, and to develop biomarker strategies to validate pharmacological efficiency in the clinic [Yates 2020] (Figure 6).  

Data science-based drug discovery and bioengineering efforts predominantly aim to de-risk the clinical development process and to repurpose drugs to treat complex diseases in heterogeneous disease populations [Otero-Carrasco 2023]. I previously highlighted this trend with the classic example of metformin, repurposed beyond its antidiabetic properties to treat oxidative stress and injury, and to inhibit the growth and migration of renal carcinoma cells via pathway-based repurposing scenarios [Liu 2022, Otero-Carrasco 2023].

More recent investigations have shown the cost-effective potential of repurposing the asthma drug omalizumab to reduce multiple food allergies by targeting IgE antibodies that are universally upregulated during an immune response [Wood 2024], and of cancer-targeting antibody-drug conjugates for precision cancer therapies.

Figure 6: Approaches to systems biology in biotechnology and pharmaceuticals. The omics (bottom-up approach) is focused on identifying global measurements of molecular components. The modeling (top-down approach) attempts to form integrative models of human physiology and diseases across several scales to generate complex cell systems [Butcher 2004].

Outlook - connecting networks

Data science has in this way revolutionized investigations for clinical product development over the past two decades, broadly encompassing multidisciplinary areas of healthcare, from neurological diseases to cardiovascular, metabolic, oncology, renal, and rare disease types. The underlying biological system is complex, and therefore the capacity to repurpose drugs and administer combinatorial therapies has heralded a proactive systems biology platform for cost-effective outcomes to advance the future of healthcare.

The aim of combining systems biology and -omics data-centered platforms is primarily to unveil the mechanism-of-action of a newly developed drug compound, or to efficiently design a drug compound by analyzing a specific drug-pathway pair. The systems biology-integrated vision also extends to decoding complex and rare diseases with intricate biological networks, to bioengineer the pathways essentially from the bottom up, and to shed light on disease mechanisms for personalized therapy. Further aims are to repurpose medicines cost-effectively and to highlight the necessity of efficiently developing innovative medicines for hitherto unmet medical needs.

Header Image: Drug development news stock image via - STEM Education and Training Builds Diversity Among Next Generation of Biomedical Scientists. 


  1. Bielekova B. et al. How implementation of systems biology into clinical trials accelerates understanding of diseases, Frontiers Neurology, 2014.
  2. Azer K. et al. Systems biology platform for efficient development and translation of multitargeted therapeutics, Frontiers Systems Biology, 2023.
  3. Wang C. et al. Drug-pathway association prediction: from experimental results to computational models, Briefings in Bioinformatics, 2021.
  4. Chen X. et al. Drug-target interaction prediction: databases, web servers and computational models, Briefings in Bioinformatics, 2016.
  5. Kanehisa M. et al. KEGG: new perspectives on genomes, pathways, diseases and drugs, Nucleic Acids Research, 2017.
  6. Bai J. et al. Translational Quantitative Systems Pharmacology in Drug Development: from Current Landscape to Good Practices, The AAPS Journal, 2019.
  7. Denaro C. et al. A pipeline for testing drug mechanism of action and combination therapies: From microarray data to simulations via Linear-In-Flux-Expressions: Testing four-drug combinations for tuberculosis treatment, Mathematical Biosciences, 2023.
  8. Ashfaq U. et al. Computational approaches for drug-metabolizing enzymes: Concepts and challenges, Biochemistry of Drug Metabolizing Enzymes, 2022.
  9. Napolitano F. et al. gene2drug: a computational tool for pathway-based rational drug repositioning, Bioinformatics, 2018.
  10. Butcher E. et al. Systems biology in drug discovery, Nature Biotechnology, 2004.
  11. Kiyosawa N. et al. Data-intensive drug development in the information age: applications of Systems Biology/Pharmacology/Toxicology, The Journal of Toxicological Sciences, 2016.
  12. Siderowf A. et al. Assessment of heterogeneity among participants in the Parkinson's Progression Markers Initiative cohort using α-synuclein seed amplification: a cross-sectional study, The Lancet, Neurology, 2023.
  13. Yates J. et al. Opportunities for Quantitative Translational Modeling in Oncology, Clinical Pharmacology and Therapeutics, 2020.
  14. Otero-Carrasco B. et al. Identifying patterns to uncover the importance of biological pathways on known drug repurposing scenarios, bioRxiv, 2023.
  15. Liu Y. et al. High-concentration Metformin reduces oxidative stress injury and inhibits the growth and migration of clear cell renal cell carcinoma, Computational and Mathematical Methods in Medicine, 2022.
  16. Wood R. et al. Omalizumab for the Treatment of Multiple Food Allergies, NEJM, 2024.
