Generative Artificial Intelligence for Drug Discovery: How the First AI-Discovered and AI-designed Drug Progressed to Phase 2 Clinical Testing

Here, we summarize and contextualize the findings of a recent Nature Biotechnology publication by Insilico Medicine. We discuss the multimodule generative AI pipeline and robotics underlying the discovery, design, development, and in-human clinical testing of the TNIK inhibitor INS018_055.

Before You Read This 

The paper covers more than three years of multidisciplinary preclinical and clinical research, along with the AI component, which we had to tone down in response to reviewers' comments. Please check the Behind the Paper articles covering the development of Insilico's Biology42: PandaOmics and Chemistry42: Generative Chemistry platforms (both are commercially available community resources with multiple papers and application notes). To appreciate the journey we took from the Generative Tensorial Reinforcement Learning (GENTRL) paper, submitted in 2018 and published in 2019 in Nature Biotechnology, to this TNIK paper, submitted in 2023 and published in 2024, please check out the GENTRL Behind the Paper article.

To facilitate easier review and to answer your questions, we developed a tool for immediate Q&A, which also allows you to see the paragraph of the paper the answer is based on, as well as primary references. The tool is available at . Please use it to talk to the paper and to other papers. On a side note, this may be one of the industry's firsts, as we submitted this paper with this conversational tool back in June 2023 when ChatGPT did not provide substantial embedded functionality. 

Transforming Traditional R&D

Traditional drug discovery and development can be segmented into three major phases: hypothesis/target discovery to lead optimization, preclinical candidate assessment, and clinical testing. This process is a substantial investment of time and financial resources, often lasting decades and accumulating costs of up to $2.6B. A categorical perspective of this process is presented in Figure 1, an adapted overview from a seminal paper by Paul et al. that highlighted the challenges facing industrial R&D nearly 13 years ago.  

Figure 1: Traditional drug R&D is a time-consuming and costly initiative that requires transformation

This prescient overview still captures many of the difficulties researchers face today. However, rapid progress in artificial intelligence (AI) research has begun to streamline this process. Advances in generative models, from generative adversarial networks (GANs) to our group’s generative tensorial reinforcement learning model (GENTRL) for de novo small-molecule design, have started to accelerate the discovery phase. Unlike traditional AI, which analyzes existing data with statistical methods to predict outcomes, generative AI (Figure 2) can produce new content; in drug design, for example, it can generate novel molecular structures. Conceptually, GENTRL prioritizes the structures it generates by optimizing three key factors: synthetic feasibility, novelty, and biological activity. A concise summary of this paradigm is available in the original research community submission that followed GENTRL’s publication in Nature Biotechnology.

This article aims to present a primer on the multimodule generative AI pipeline that drove the discovery, development, and in-human clinical testing of INS018_055, the first AI-developed drug to enter clinical testing. Here, we will highlight our significant findings, their context in idiopathic pulmonary fibrosis (IPF) research, and how AI-led drug discovery offers hope to patients battling IPF and other unmet clinical needs.
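To make the idea of prioritizing generated structures on three factors more concrete, here is a minimal sketch in Python. It is not GENTRL's actual reward function: the `Candidate` class, the equal weights, the molecule names, and all score values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A generated structure with three hypothetical scores in [0, 1]."""
    name: str
    synthetic_feasibility: float  # higher = easier to synthesize
    novelty: float                # higher = less similar to known compounds
    predicted_activity: float     # higher = stronger predicted target activity


def reward(c: Candidate, weights=(1.0, 1.0, 1.0)) -> float:
    """Weighted sum of the three factors (illustrative weights only)."""
    w_synth, w_novel, w_active = weights
    return (w_synth * c.synthetic_feasibility
            + w_novel * c.novelty
            + w_active * c.predicted_activity)


def prioritize(candidates, weights=(1.0, 1.0, 1.0)):
    """Return candidates sorted from highest to lowest reward."""
    return sorted(candidates, key=lambda c: reward(c, weights), reverse=True)


# Invented example molecules and scores.
candidates = [
    Candidate("mol_A", 0.9, 0.2, 0.5),
    Candidate("mol_B", 0.6, 0.8, 0.7),
    Candidate("mol_C", 0.3, 0.9, 0.2),
]
ranked = [c.name for c in prioritize(candidates)]
print(ranked)
```

In this toy setting, a structure that is moderately good on all three factors outranks one that is excellent on a single factor; changing the weights shifts that balance.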

Figure 2: A timeline of generative adversarial networks for drug discovery.

The AI Pipeline Driving Discovery Progress

At the heart of our group’s study is the AI-augmented pipeline that modernized the discovery phase of our research, a tandem effort between PandaOmics and Chemistry42. Idiopathic pulmonary fibrosis (IPF) is a generally lethal lung disease driven by fibroblast proliferation and extensive extracellular matrix deposition, which together impair lung function. In untreated patients, IPF has a highly variable clinical course with a median survival time of 2-3 years, and its incidence is growing at an alarming rate. Fewer than 30% of IPF patients benefit from currently approved treatments such as corticosteroids and two pan-tyrosine kinase inhibitors, making novel therapies in this area an outstanding unmet clinical need.

To build the initial hypothesis, our group used the target discovery module PandaOmics and trained it on a collection of omics and clinical datasets related to tissue fibrosis, annotated by age and sex. Target selection relied on a gene and pathway scoring system derived from the predecessor of PandaOmics, iPANDA (in silico pathway activation network decomposition analysis), which we published in Nature Communications in 2016. PandaOmics generated a list of candidate targets using deep feature synthesis, causality inference, and de novo pathway reconstruction. A natural language processing (NLP) engine, which analyzes millions of data files including patents, publications, grants, and clinical trial databases, assessed target novelty and disease association. PandaOmics thus revealed 20 targets for validation, and one novel intracellular target, TRAF2- and NCK-interacting kinase (TNIK), was selected for thorough investigation.

TNIK was identified as the number one target. Previous literature had tangentially linked it to multiple fibrosis-driving pathways, namely the wingless/integrated (WNT), transforming growth factor-beta (TGF-β), Hippo, c-Jun N-terminal kinase (JNK), and nuclear factor kappa B (NF-κB) signaling cascades. However, TNIK had never been studied as a therapeutic target in IPF and was thus highly ranked by the AI algorithm. The association of TNIK with IPF was validated using single-cell gene expression datasets from healthy lung tissue and fibrotic lung tissue from IPF patients, which confirmed its enrichment in fibrotic tissue. A timeline of TNIK biology, from before its proposal as a potential IPF target in our study to the time of this publication, is presented in Figure 3.
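The general idea of ranking targets by combining disease-association evidence with a novelty signal can be sketched as follows. This is a simplified illustration and not the PandaOmics scoring system: the gene names are placeholders, the evidence scores and publication counts are invented, and the `novelty` formula and `novelty_weight` parameter are assumptions made for this example.

```python
import math

# Invented evidence scores in [0, 1] and invented prior-publication counts.
targets = {
    "GENE_A": {"omics": 0.80, "pathway": 0.70, "text": 0.90, "publications": 999},
    "GENE_B": {"omics": 0.75, "pathway": 0.80, "text": 0.70, "publications": 9},
    "GENE_C": {"omics": 0.30, "pathway": 0.20, "text": 0.40, "publications": 0},
}


def association(entry) -> float:
    """Mean of the evidence scores linking a target to the disease."""
    return (entry["omics"] + entry["pathway"] + entry["text"]) / 3.0


def novelty(entry) -> float:
    """Assumed novelty measure: decreases as prior publications increase."""
    return 1.0 / (1.0 + math.log10(1 + entry["publications"]))


def composite(entry, novelty_weight: float = 0.3) -> float:
    """Blend association and novelty; the weight is a hypothetical choice."""
    return (1.0 - novelty_weight) * association(entry) + novelty_weight * novelty(entry)


ranking = sorted(targets, key=lambda gene: composite(targets[gene]), reverse=True)
for gene in ranking:
    print(gene, round(composite(targets[gene]), 3))
```

With these made-up numbers, a target with strong evidence and little prior literature ranks above a well-studied target with slightly stronger evidence, which mirrors the reasoning described for TNIK above.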

Figure 3: A timeline of events in the identification, investigation, and targeting of TNIK. Major milestones by our group are highlighted relative to the seminal discoveries in TNIK biology.

Next, we used our generative chemistry module, Chemistry42, to design a drug that could safely, specifically, and efficiently inhibit TNIK function. Chemistry42 uses a structure-based drug design (SBDD) workflow, which was enlisted to generate a library of virtual structures. Thirty generative models were employed in parallel to produce compound structures, after which our scientists provided feedback to optimize this virtual screening. Following multiple iterative screens, the TNIK ATP-binding site was selected as the target binding pocket. The most promising lead to emerge, initially called ISM001, demonstrated activity against TNIK with a nanomolar half-maximal inhibitory concentration (IC50). Further rounds of de novo compound generation optimized ISM001 to increase solubility, promote a favorable ADME and safety profile, and retain its inhibitory activity against TNIK, thus producing INS018_055.

Importantly, our research team turned a hypothesis into a convincing reality by successfully treating multiple animal models of fibrosis with INS018_055. Oral or inhaled INS018_055 treatment of mice and rats with induced pulmonary fibrosis attenuated fibrotic progression by reducing fibroblast activation, reducing the deposition of fibrotic proteins, and attenuating lung inflammation, thus improving lung function. Furthermore, INS018_055 demonstrated pan-fibrotic inhibitory function, attenuating skin and kidney fibrosis in two additional in vivo models. This observation highlights the potential indication expansion opportunities for this molecule and will be an area of active research for many groups in the future. Following the conclusion of these studies, INS018_055 achieved preclinical candidate nomination in early 2021, roughly 18 months after PandaOmics proposed TNIK as a druggable target for IPF in 2019.
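For readers less familiar with IC50, the half-maximal inhibitory concentration mentioned above, the short sketch below uses the standard Hill equation to show what the value means: at a concentration equal to the IC50, half of the target activity is inhibited. The IC50 of 10 nM and the Hill coefficient of 1 are hypothetical values for illustration, not measurements from the paper.

```python
def fraction_inhibited(concentration_nm: float, ic50_nm: float, hill: float = 1.0) -> float:
    """Fraction of target activity inhibited, from the Hill equation.

    inhibition = C^h / (IC50^h + C^h), written here in ratio form.
    """
    if concentration_nm < 0 or ic50_nm <= 0:
        raise ValueError("concentration must be >= 0 and IC50 must be > 0")
    ratio = (concentration_nm / ic50_nm) ** hill
    return ratio / (1.0 + ratio)


EXAMPLE_IC50_NM = 10.0  # hypothetical nanomolar IC50, for illustration only

for conc in (1.0, 10.0, 100.0):
    pct = 100.0 * fraction_inhibited(conc, EXAMPLE_IC50_NM)
    print(f"{conc:6.1f} nM -> {pct:5.1f}% inhibition")
```

A lower IC50 shifts this curve to the left, meaning less compound is needed to reach the same level of inhibition.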

INS018_055: From Preclinical Candidate Nomination to Clinical Trials

In November 2021, just nine months after its preclinical candidate nomination, our group announced that the first healthy volunteers had been dosed in a first-in-human (FIH) microdose trial of INS018_055 in Australia (ACTRN12621001541897). This trial in 8 healthy volunteers exceeded expectations and demonstrated favorable pharmacokinetic and safety profiles for INS018_055. These data are presented in the current manuscript and set the stage for the next step of clinical testing. Importantly, this success marked a major milestone for our team and the broader generative AI field: from novel target discovery to Phase 1 testing in under 30 months. Relative to traditional drug discovery methodology, this AI-powered approach accomplished this feat in half the time and for a fraction of the cost.

Following these positive results, our group collaborated with researchers in New Zealand to conduct a randomized, double-blind, placebo-controlled Phase 1 clinical trial (NCT05154240) to evaluate the safety, tolerability, and pharmacokinetic properties of INS018_055 in 78 healthy volunteers. This Phase 1 study was initiated in February 2022, and the final follow-up visit was completed in November 2022. Healthy volunteers were enrolled in ten cohorts of 5 single ascending dose (SAD) and 3 multiple ascending dose (MAD) sub-cohorts, with another 14 healthy volunteers in the drug-drug interaction (DDI) study, to determine the maximum tolerated dose and establish dosage guidelines for future Phase 2 studies. In January 2023, our clinical collaborators concluded that the human pharmacokinetic data for INS018_055 were in total agreement with our research group's preclinical modeling, with no undesirable drug accumulation following a seven-day post-treatment period. Furthermore, INS018_055 was generally safe and well tolerated, with no serious adverse events or mortality reported in any healthy participants enrolled in the trial. All treatment-related adverse effects in the SAD and MAD cohorts were of mild severity, resolved by the study's end, and are comprehensively documented in our published work. A second Phase 1 study in a different population in China (CTR20221542) reached similar conclusions. The two Phase 1 studies agreed that INS018_055 is safe, well tolerated, and possesses good oral bioavailability and dose-proportional pharmacokinetics in healthy volunteers. This achievement and complete body of work reflect all the data in our current publication and the power that generative AI has imparted to the drug discovery process. However, the story of INS018_055 continues.
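Dose-proportional pharmacokinetics, as noted above, means that exposure scales linearly with dose. One common way to check this is the power model, exposure = a * dose^b, where an exponent b close to 1 indicates dose proportionality. The sketch below fits b by least squares on log-transformed values. The doses and exposures are synthetic numbers constructed for this example, not trial data, and the function name `loglog_slope` is our own.

```python
import math


def loglog_slope(doses, exposures) -> float:
    """Least-squares slope of log(exposure) versus log(dose).

    Under the power model exposure = a * dose**b, this slope estimates b.
    """
    xs = [math.log(d) for d in doses]
    ys = [math.log(e) for e in exposures]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic data: exposure exactly proportional to dose (b = 1).
doses_mg = [10.0, 30.0, 100.0, 300.0]
proportional = [2.0 * d for d in doses_mg]
slope = loglog_slope(doses_mg, proportional)

# Synthetic counterexample: exposure grows with the square root of dose (b = 0.5).
sub_proportional = [d ** 0.5 for d in doses_mg]
sub_slope = loglog_slope(doses_mg, sub_proportional)

print(f"proportional: b = {slope:.3f}; sub-proportional: b = {sub_slope:.3f}")
```

In practice, such an analysis would also report a confidence interval for b rather than a point estimate alone.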

A Look into the Future of INS018_055 and AI-powered Drug Discovery

At the time of this publication, two Phase 2a trials of INS018_055 (NCT05975983 and NCT05938920) for treating IPF are underway. In these trials, our team will evaluate lung function in patients with IPF from two distinct demographics in the United States and China. These randomized, double-blind, placebo-controlled clinical trials began recruitment in the summer of 2023 and aim to recruit 120 patients across more than 40 distinct sites. The trials will assess the safety, tolerability, and pharmacokinetics of a 12-week oral dosing regimen of INS018_055 in patients with IPF. For the first time since the initiation of this large body of work, these clinical trials will also assess the preliminary efficacy of INS018_055 on lung function in IPF patients, which is a groundbreaking achievement for generative AI in the drug discovery field. Importantly, this clinical advancement represents hope for the roughly five million people worldwide suffering from this debilitating disease. With few treatment options available and a life expectancy of 2-5 years following diagnosis, the success of INS018_055 may provide IPF patients with a remarkable new opportunity to conquer a largely untreatable disease.

Figure 4: From project inception in 2019 to the ongoing Phase 2a clinical trials in IPF patients, this timeline covers the full scope of our AI-driven study. Following its indication prioritization in 2020, INS018_055 was validated as a potent, safe, and selective TNIK inhibitor in gold-standard in vitro analyses. Its efficacy was tested in three in vivo murine fibrosis models; it passed safety standards in Phase 0 and Phase 1 studies and is currently being tested in two Phase 2a clinical trials in the US and China.

The success of INS018_055 in preclinical studies and Phase 1 clinical trials has significant implications for the drug discovery field. Indeed, these accomplishments validate the effectiveness of our AI-enabled drug discovery Pharma AI suite and the comprehensive nature of our study (Figure 4). Our results also set a precedent for the potential of this technology to accelerate drug discovery in many other contexts. At a fraction of the cost and time of traditional drug discovery methods, our team hypothesized a novel IPF therapy, harnessed the power of AI to facilitate the discovery and preclinical testing of this intervention, completed preclinical candidate nomination, and conducted a first-in-human Phase 1 clinical trial for this original hypothesis. Using our publication as a guide, one can extrapolate how generative AI drug discovery tools may streamline other discovery efforts. This expansion would directly address many of the challenges that industry R&D faces, as Paul et al. outlined in their seminal 2010 review article. For example, AI-led target discovery and identification efforts may reduce the likelihood that an identified target fails to reach preclinical candidate nomination. This costly step, undertaken by academic and private sector actors alike, is a pitfall for many research projects; avoiding wrong targets or redundant efforts is an area in which AI could leave a lasting imprint. Additionally, advances in computational speed and power will continue to streamline the generation of novel small molecules through tools like Chemistry42. Thus, this study underscores the strength of AI-led drug discovery methodology, and similar applications of generative AI technology will be a driving factor in revolutionizing the entire drug discovery field.

A Docuthon to Remember

To commemorate and help visualize the advances made in artificial intelligence drug discovery (AIDD), our group has worked to document this journey from its beginning to its current status in clinical testing. We have compiled hundreds of hours of documentary footage during the timeline of this study to inform, encourage, and demonstrate the possibilities that exist in AIDD to the broader scientific and non-scientific communities. A visual timeline of our TNIK Program is available to the general public, and it recounts the program's story as it unfolded from its AI-backed proposal of TNIK as a potential IPF target in 2019. To further encourage the dissemination of the narration of AI Drug Discovery and Development (AIDDD) and the development of Insilico Medicine, we are hosting a Docuthon to invite filmmakers and science enthusiasts to try their hand at telling our story. This friendly competition aims to provide a platform for producing creative and interesting visual approaches to garner public interest in this rapidly developing research field, and all submissions are welcome.
