KnowDDI: Accurate and interpretable drug-drug interaction prediction enabled by knowledge subgraph learning

KnowDDI is a graph neural network that leverages biomedical knowledge graphs to enhance drug representations, addressing the challenge of drug-drug interaction (DDI) prediction in clinical treatments.
Published in Computational Sciences

Accurate prediction of drug-drug interactions (DDIs) plays an important role in biomedicine and healthcare. On the one hand, combination therapies, in which multiple drugs are used together, can treat complex diseases and comorbidities, such as human immunodeficiency virus (HIV)[1]. On the other hand, DDIs are an important cause of adverse drug reactions, which account for 1% of hospitalizations in the general population and 2–5% of hospital admissions in the elderly[2].

Identifying DDIs through clinical evidence such as laboratory studies is extremely costly and time-consuming[2]. In recent years, computational techniques, especially deep learning approaches, have been developed to speed up the discovery of potential DDIs. DDI fact triplets can naturally be represented as a graph in which each node corresponds to a drug and each edge represents an interaction between two drugs. Given DDI fact triplets, a number of graph learning methods have been developed to identify unknown interactions between drug-pairs. Graph neural networks (GNNs)[3], which obtain expressive node embeddings by learning end-to-end from the topological structure and associated node features, have also been applied to the DDI prediction problem. However, known DDI fact triplets are rare because of the high experimental cost and the continual emergence of new drugs. As a result, over-parameterized deep learning models cannot fully exploit their expressive power and may perform even worse than traditional two-stage embedding methods[4].
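
To make the graph view of DDI fact triplets concrete, here is a minimal sketch in Python. The drug names and interaction labels are placeholders rather than real data, and the helper names `build_ddi_graph` and `unknown_pairs` are hypothetical; the sketch only shows how triplets become a graph and which drug-pairs remain as prediction targets.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact triplets (head drug, interaction type, tail drug).
# Drug names and interaction labels are placeholders, not real data.
TRIPLETS = [
    ("drug_a", "type_1", "drug_b"),
    ("drug_b", "type_2", "drug_c"),
    ("drug_a", "type_3", "drug_d"),
]


def build_ddi_graph(triplets):
    """Map each drug to a list of (neighbor, interaction type) pairs."""
    graph = defaultdict(list)
    for head, relation, tail in triplets:
        # Assumption: each interaction is stored in both directions.
        graph[head].append((tail, relation))
        graph[tail].append((head, relation))
    return dict(graph)


def unknown_pairs(triplets):
    """List drug-pairs with no recorded interaction (the prediction targets)."""
    known = {frozenset((head, tail)) for head, _, tail in triplets}
    drugs = sorted({drug for head, _, tail in triplets for drug in (head, tail)})
    return [pair for pair in combinations(drugs, 2) if frozenset(pair) not in known]


graph = build_ddi_graph(TRIPLETS)
candidates = unknown_pairs(TRIPLETS)
```

Even in this toy graph with four drugs, half of the six possible pairs have no recorded interaction, which hints at how sparse real DDI graphs can be.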

In biomedicine and healthcare, many international agencies regularly maintain rich, publicly available biomedical data resources. Researchers then integrate these disparate and heterogeneous resources into knowledge graphs (KGs) to facilitate organized use of the information. These KGs, such as Hetionet[5], contain rich prior knowledge discovered in biomedicine and healthcare. Recent works merge the DDI network with an external KG into a combined network, extract an enclosing subgraph for each drug-pair to encode drug-pair-specific information, and then predict the DDI for the target drug-pair from the concatenation of the drugs' node embeddings and the embedding of the enclosing subgraph[6]. However, because these KGs integrate diverse data resources through automated processes or expert curation, they contain noisy or inconsistent information that existing methods fail to filter out. As a result, properly leveraging external KGs remains a challenging problem.
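
The enclosing-subgraph step can be sketched as follows. This is an illustration under stated assumptions, not the procedure of any particular paper: the enclosing subgraph is taken to be the intersection of the k-hop neighborhoods of the two drugs, edge types are dropped for brevity, and all node names and helper functions are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical edges: DDI_EDGES come from a DDI graph, KG_EDGES from an external KG.
DDI_EDGES = [("drug_a", "drug_c"), ("drug_c", "drug_b")]
KG_EDGES = [
    ("drug_a", "gene_x"),
    ("drug_b", "gene_x"),
    ("gene_x", "disease_y"),
    ("drug_d", "disease_z"),
]


def build_adjacency(*edge_lists):
    """Merge several edge lists into one undirected adjacency map."""
    adjacency = defaultdict(set)
    for edges in edge_lists:
        for u, v in edges:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def k_hop_nodes(adjacency, source, k):
    """Return all nodes within k hops of source (source included) via BFS."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distance[node] == k:
            continue  # do not expand beyond k hops
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return set(distance)


def enclosing_nodes(adjacency, head, tail, k):
    """Assumed definition: nodes shared by both k-hop neighborhoods, plus the pair."""
    shared = k_hop_nodes(adjacency, head, k) & k_hop_nodes(adjacency, tail, k)
    return shared | {head, tail}


combined = build_adjacency(DDI_EDGES, KG_EDGES)
one_hop = enclosing_nodes(combined, "drug_a", "drug_b", k=1)
two_hop = enclosing_nodes(combined, "drug_a", "drug_b", k=2)
```

In this toy network, widening the neighborhood from one hop to two pulls in `disease_y`, a node that is only attached to `gene_x` and connects the two drugs through no route of its own. That is the kind of irrelevant information an unfiltered subgraph can carry.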

We propose KnowDDI (Fig. 1), an accurate and interpretable method for DDI prediction. First, we merge the provided DDI graph and an external KG into a combined network, on which generic representations for all nodes are learned to encode generic knowledge. Next, we extract a drug-flow subgraph for each drug-pair from the combined network. We then learn a knowledge subgraph from the generic representations and the drug-flow subgraph. After optimization, the representations of drugs are transformed to be more predictive of the DDI types between the target drug-pair. In addition, the returned knowledge subgraph contains explaining paths that interpret the prediction result for the drug-pair; these paths consist only of edges for important known DDIs or newly added edges connecting highly similar drugs. In other words, the learned knowledge subgraph filters out irrelevant information and adds resembling relationships between drugs whose interactions are unknown. This allows the lack of DDIs to be implicitly compensated for by the enriched drug representations and propagated drug similarities.
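
This post does not spell out how the drug-flow subgraph is extracted. One plausible reading, assumed here purely for illustration, is to keep only nodes that lie on some walk of bounded length between the two drugs. The sketch below implements that reading with two breadth-first searches; the toy edges, the `max_length` parameter, and the helper names are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical combined network (DDI edges and KG edges merged, types omitted).
EDGES = [
    ("drug_a", "drug_c"),
    ("drug_c", "drug_b"),
    ("drug_a", "gene_x"),
    ("drug_b", "gene_x"),
    ("gene_x", "disease_y"),
    ("drug_d", "disease_z"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def hop_distances(adjacency, source):
    """BFS hop distance from source to every reachable node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


def flow_nodes(adjacency, head, tail, max_length):
    """Keep nodes on some head-to-tail walk of at most max_length edges.

    A node v qualifies exactly when dist(head, v) + dist(v, tail) <= max_length.
    """
    from_head = hop_distances(adjacency, head)
    from_tail = hop_distances(adjacency, tail)
    return {
        node
        for node in from_head
        if node in from_tail and from_head[node] + from_tail[node] <= max_length
    }


adjacency = build_adjacency(EDGES)
flow = flow_nodes(adjacency, "drug_a", "drug_b", max_length=2)
```

Under this assumption, the dead-end node `disease_y` is pruned because any walk through it needs four edges, while `drug_c` and `gene_x` stay because each sits on a two-edge route between the target drugs.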

On a combined network that merges the drug-drug interaction (DDI) graph with an external knowledge graph (KG), generic embeddings of all nodes are first learned to capture generic knowledge. Then, for each target drug-pair, a drug-flow subgraph is extracted from the combined network, with its node embeddings initialized as the generic embeddings. By propagating drug resembling relationships, the generic embeddings are transformed to be more predictive of the DDI types between the drug-pair, and the drug-flow subgraph is adapted into a knowledge subgraph that contains explaining paths to interpret the prediction result.
Fig. 1 Overview of KnowDDI.
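
The idea of adding edges between highly similar drugs can be illustrated with a small sketch. KnowDDI learns its knowledge subgraph during optimization; the fixed cosine-similarity threshold used here is only a stand-in chosen for illustration, and the embeddings, the `resembles` label, and the function names are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical 2-dimensional drug embeddings; real embeddings would be learned.
EMBEDDINGS = {
    "drug_a": [1.0, 0.0],
    "drug_b": [0.9, 0.1],
    "drug_c": [0.0, 1.0],
}
KNOWN_EDGES = [("drug_a", "interacts", "drug_c")]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def add_resembling_edges(embeddings, edges, threshold=0.9):
    """Append a 'resembles' edge for each unlinked pair above the threshold."""
    linked = {frozenset((head, tail)) for head, _, tail in edges}
    augmented = list(edges)
    for u, v in combinations(sorted(embeddings), 2):
        if frozenset((u, v)) in linked:
            continue  # keep the known interaction, do not duplicate the pair
        if cosine_similarity(embeddings[u], embeddings[v]) >= threshold:
            augmented.append((u, "resembles", v))
    return augmented


subgraph = add_resembling_edges(EMBEDDINGS, KNOWN_EDGES)
```

Here `drug_a` and `drug_b` point in nearly the same direction and gain a `resembles` edge, so information about `drug_a`'s known interaction with `drug_c` can reach `drug_b` even though no interaction is recorded for it.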

In this study, we perform experiments on two publicly available benchmark DDI datasets: (i) DrugBank[7], a multiclass DDI prediction dataset covering 86 types of pharmacological relations between drugs; and (ii) TWOSIDES[8], a multilabel DDI prediction dataset recording multiple DDI side effects between drugs. We adopt Hetionet[5], a benchmark biomedical KG used for various drug discovery tasks, as the external KG. In extensive experiments on these benchmark datasets, KnowDDI consistently outperforms existing works. A series of case studies further shows that KnowDDI can discover convincing explaining paths (Fig. 2) that help interpret the DDI prediction results.
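
The difference between the multiclass and multilabel settings shows up in how targets are encoded and how scores are turned into predictions. The sketch below assumes generic conventions (a single class index versus a multi-hot vector, argmax versus a 0.5 threshold); the side-effect names and function names are placeholders, and only the count of 86 relation types comes from the text.

```python
NUM_DRUGBANK_TYPES = 86  # number of relation types stated for DrugBank


def encode_multiclass(type_index, num_classes=NUM_DRUGBANK_TYPES):
    """Multiclass target: each drug-pair has exactly one relation type."""
    if not 0 <= type_index < num_classes:
        raise ValueError("relation type index out of range")
    return type_index


def encode_multilabel(active_effects, vocabulary):
    """Multilabel target: a multi-hot vector over all side effects."""
    position = {name: i for i, name in enumerate(vocabulary)}
    target = [0] * len(vocabulary)
    for name in active_effects:
        target[position[name]] = 1
    return target


def predict_multiclass(scores):
    """Pick the single highest-scoring relation type."""
    return max(range(len(scores)), key=lambda i: scores[i])


def predict_multilabel(probabilities, threshold=0.5):
    """Flag every side effect whose probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


# Placeholder side-effect vocabulary, not taken from TWOSIDES.
VOCABULARY = ["effect_1", "effect_2", "effect_3"]
target = encode_multilabel(["effect_2", "effect_1"], VOCABULARY)
```

A multiclass predictor commits to one relation type per drug-pair, whereas a multilabel predictor may flag several side effects, or none, for the same pair.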

Fig. 2 A visualization of subgraph and explaining paths learned by KnowDDI.

In this study, we set out to develop an effective way to build an accurate DDI predictor from rare DDI fact triplets. KnowDDI achieves this goal by taking advantage of the rich knowledge available in biomedicine and healthcare and the plasticity of deep learning approaches. The architecture of KnowDDI can be further improved. For instance, pretraining the GNN on other large datasets may provide better initial parameters and therefore reduce the training time. In addition, we do not use any molecular features of drugs, in order to test how well KnowDDI learns solely from the combination of the external KG and DDI fact triplets. Incorporating such predefined node features may improve the predictive performance of KnowDDI in the future.

Although we implement KnowDDI for DDI prediction in this paper, it is a general approach that can be applied to other related tasks, such as detecting possible protein–protein interactions, drug–target interactions, and disease-gene interactions. Practitioners can readily leverage the rich biomedical knowledge in large KGs to obtain good and explainable prediction results. We believe our open-source KnowDDI can serve as an original algorithm and a unique deep learning tool to promote the development of biomedicine and healthcare. For example, it can help detect possible interactions of new drugs, accelerating drug design. Given patients' drug profiles, KnowDDI can be used to identify possible adverse reactions. These results have the potential to serve as a valuable resource for alerting clinicians and healthcare providers when devising management plans for polypharmacy, as well as for guiding the inclusion criteria for participants in clinical trials. Beyond biomedicine and healthcare, similar approaches can be developed to adaptively leverage large domain-specific KGs for downstream applications in low-data regimes.


[1] Bangalore S, Kamalakkannan G, Parkar S, et al. Fixed-dose combinations improve medication compliance: a meta-analysis[J]. The American journal of medicine, 2007, 120(8): 713-719.

[2] Jiang H, Lin Y, Ren W, et al. Adverse drug reactions and correlations with drug–drug interactions: A retrospective study of reports from 2011 to 2020[J]. Frontiers in Pharmacology, 2022, 13: 923939.

[3] Zitnik M, Agrawal M, Leskovec J. Modeling polypharmacy side effects with graph convolutional networks[J]. Bioinformatics, 2018, 34(13): i457-i466.

[4] Yao J, Sun W, Jian Z, et al. Effective knowledge graph embeddings based on multidirectional semantics relations for polypharmacy side effects prediction[J]. Bioinformatics, 2022, 38(8): 2315-2322.

[5] Himmelstein D S, Baranzini S E. Heterogeneous network edge prediction: a data integration approach to prioritize disease-associated genes[J]. PLoS computational biology, 2015, 11(7): e1004259.

[6] Yu Y, Huang K, Zhang C, et al. SumGNN: multi-typed drug interaction prediction via efficient knowledge graph summarization[J]. Bioinformatics, 2021, 37(18): 2988-2995.

[7] Wishart D S, Feunang Y D, Guo A C, et al. DrugBank 5.0: a major update to the DrugBank database for 2018[J]. Nucleic acids research, 2018, 46(D1): D1074-D1082.

[8] Tatonetti N P, Ye P P, Daneshjou R, et al. Data-driven prediction of drug effects and interactions[J]. Science translational medicine, 2012, 4(125): 125ra31-125ra31.


