Future-proofing drug development with GenAI

Using GenAI and expert reasoning, drug developers can now explore an asset’s long-term potential as early as the preclinical stage. This shift is helping to reshape pipeline planning and refine therapeutic strategy.

It is becoming increasingly evident that generative artificial intelligence (GenAI) is a useful tool for helping pharmaceutical companies reduce the manual tasks that clinical trials require. However, R&D stakeholders are learning that GenAI underpinned by domain expertise also enables deep dives into the broader long-term potential of their investigational assets as early as the preclinical phase. This is especially relevant given today’s heavier focus on advancing personalised medicine through emerging scientific findings.

Given macro healthcare influences (eg, economic uncertainty, environmental changes) and the numerous available treatments for major diseases, drug developers may need to reassess their therapeutic strategies. Long-term sustainability may benefit from identifying priority indications, exploring emerging mechanisms of action and refining development priorities.

A multi-agent framework combining large language models (LLMs), reasoning models and advanced algorithms is revealing the critical unknowns within R&D strategies that affect an asset’s chance of meeting its potential.

Because the full potential of future assets can’t be directly observed or measured, drug developers are often unaware of certain opportunities or risks. Using AI to uncover asset value and direction at an early stage may therefore help them prioritise pipelines and business goals.

Unintentionally limiting asset potential

A vast amount of chemical space remains untapped and could further the development of drug molecules, yet traditional drug discovery and development has been hindered by resource-heavy, manual approaches to exploring strategic pathways. This has led drug developers to unintentionally limit their potential within chosen therapeutic spaces. With up to 90 percent of assets never making it to market and competition at an all-time high, it is worth discussing how AI-driven approaches might help set a stronger foundation for possibilities downstream. [1]

Seeing the bigger picture of strategic options

Because LLMs are trained on extensive, internet-scale datasets, they can learn to identify contexts linking words and language. When an LLM is grounded in scientific datasets, it can learn context that helps identify entities (eg, diseases, symptoms, molecules) across multiple data sources. When several sources refer to the same drug, disease or protein, a well-designed LLM grounded in trusted data can connect the context across scientific literature, clinical trial results and real-world evidence sources, such as electronic health records or omics datasets.
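As a rough illustration of what identifying the same entity across data sources means in practice, the sketch below resolves different surface mentions to one canonical entity and records which source types refer to it. It is a deliberately simplified stand-in: a fixed synonym lookup replaces the learned context a grounded LLM would use, and every name, identifier and record in it is a hypothetical example rather than real data.

```python
from collections import defaultdict

# Hypothetical synonym table: surface form -> canonical entity ID.
# A grounded LLM would infer these links from context; a lookup is used
# here only to keep the sketch small and runnable.
SYNONYMS = {
    "rheumatoid arthritis": "DISEASE:RA",
    "ra": "DISEASE:RA",
    "tnf": "PROTEIN:TNF",
    "tnf-alpha": "PROTEIN:TNF",
    "tumour necrosis factor": "PROTEIN:TNF",
}

# Invented records standing in for three kinds of data source.
RECORDS = [
    {"source": "literature", "mention": "TNF-alpha"},
    {"source": "clinical_trial", "mention": "Rheumatoid Arthritis"},
    {"source": "ehr", "mention": "RA"},
    {"source": "literature", "mention": "tumour necrosis factor"},
    {"source": "ehr", "mention": "unrecognised term"},
]


def normalise(mention):
    """Return the canonical ID for a mention, or None if it is unknown."""
    return SYNONYMS.get(mention.strip().lower())


def link_entities(records):
    """Map each canonical entity to the set of sources that mention it."""
    linked = defaultdict(set)
    for record in records:
        entity_id = normalise(record["mention"])
        if entity_id is not None:  # unknown mentions are skipped, not guessed
            linked[entity_id].add(record["source"])
    return dict(linked)


linked = link_entities(RECORDS)
for entity_id, sources in sorted(linked.items()):
    print(entity_id, sorted(sources))
```

Unknown mentions are dropped rather than matched loosely, which mirrors the article's emphasis on grounding in trusted data: a link that cannot be supported is left for expert review.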

Leveraging the extensive breadth of available data to identify entities and relationships across data sources, clinical research experts, therapeutic specialists, machine learning (ML) engineers and others can collectively evaluate areas of interest that may create new opportunities for the asset and a broader clinical strategy.

Drug developers should also know the various ways LLMs and other AI-based methodologies can shed light on an asset’s profile, strategy and potential therapeutic and commercial promise before a trial starts. AI-driven solutions allow drug developers to explore forward-looking questions, including:

  • What may be achieved with our asset over the next 15 years? Could we expand beyond the initial approved indication?
  • For our therapeutic focus, what future trends should be expected? Any growth areas?
  • Where do we focus our resources and capabilities for pipeline and/or asset direction over the long term? Which indications have the highest commercial potential?
  • What is the full scope of therapeutic focus(es) and patient subpopulations our asset may reach?
  • How differentiated is our asset from existing and emerging competitors?

New approach methodologies data

In April 2025, the US Food and Drug Administration (FDA) announced a plan to replace animal testing in the development of monoclonal antibodies and other therapies with validated “human-relevant” methods, including AI-based computational models of toxicity, cell lines and organoid toxicity testing. [2]

This plan further encourages the strategic use of AI modelling and real-world human data, which are considered new approach methodologies (NAMs) data. Encouraging the inclusion of NAMs data in investigational new drug (IND) applications allows drug developers to use AI to generate predictions about asset profiles, including:

  • Creating virtual cohorts (PBPK/PD digital twins) to explore absorption, distribution and metabolism, mitigating risk
  • Using deep learning on chemical structures and historic toxicity data, gauging organ-specific safety issues
  • Mapping on- and off-target interactions across thousands of proteins to prioritise molecules before wet lab screens
  • Exploring precision dosing by combining and reviewing genomic, transcriptomic and exposome data to model response variability.

When looking to analyse safety and efficacy across the development pipeline, it is also possible to integrate imaging, multiomics and clinical endpoints into end-to-end predictive models. These models can help uncover subtle patterns and correlations that may not otherwise be evident. By combining diverse data modalities, drug developers can generate more holistic insights to better anticipate adverse events, stratify patient populations and optimise trial design. This integrated approach also supports earlier go/no-go decisions and more targeted therapeutic development.

Creating a strategic head start with expert reasoning

These models must align with what R&D stakeholders aim to achieve and be based on curated, connected data to provide thorough, accurate and useful outputs. Once an LLM has been used to pull, define and organise the context between data, expert human oversight is needed to apply clinical reasoning and logic and derive meaningful insights from the model’s output. Therapeutic and clinical trial experts with a deep understanding of emerging medicine trends and developments can recognise nuanced context and decipher the R&D possibilities.

The ability to extract and analyse layered, connected insights offers evidence-driven answers about how assets or portfolios can realise their potential. Drug developers may discover:

  • Mechanistic flexibility. The compound’s mechanism of action (MOA) may be used in a slightly different patient population or adjacent therapeutic focus, enabling label expansion or pursuit of novel indications.
  • Indication prioritisation. Indications of interest may be ranked according to potential for technical success, depth of the unmet need and likelihood of commercial success for more targeted and informed investments.
  • Molecular innovation. With slight modifications to an antibody class, it is possible to explore new therapeutic areas or enhance impact within an existing one. Some single antibody scaffolds are already being applied across different disease areas or are demonstrating improved outcomes for patients within the same indication due to improved targeting or delivery mechanisms.
  • Preclinical advantage. Compared to the standard of care at the same stage of development, a preclinical asset may demonstrate promising activity against a known target but with a more favourable toxicity profile. This early level of insight can help prioritise assets with differentiated potential and reduce risk of downstream attrition.
  • Biomarker and patient stratification insights. Language and reasoning models can help identify predictive biomarkers or patient subgroups more likely to respond to treatment, enabling more precise trial design and a higher likelihood of success.
  • Lifecycle planning and repurposing. By mapping the scientific and clinical landscape over time, developers can spot opportunities to reposition shelved assets, repurpose for rare diseases or explore combinations with synergistic therapies.

Rather than focusing on one-off indications, developers can adjust course as findings emerge and build comprehensive portfolio strategies.
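The indication prioritisation described above can be pictured as a simple weighted ranking across the three criteria the article names: technical success, unmet need and commercial success. The sketch below is only one possible way to do this. The weights, the 0 to 1 scoring scale and the candidate indications are all assumptions made up for illustration; in practice the inputs would come from expert review of the model’s output.

```python
from dataclasses import dataclass


@dataclass
class Indication:
    name: str
    technical_success: float     # assumed 0..1 estimate of technical success
    unmet_need: float            # assumed 0..1 rating of depth of unmet need
    commercial_potential: float  # assumed 0..1 rating of commercial success


# Hypothetical weights; a real team would set these to match its strategy.
WEIGHTS = {
    "technical_success": 0.4,
    "unmet_need": 0.3,
    "commercial_potential": 0.3,
}


def score(indication, weights=WEIGHTS):
    """Weighted sum of the three criteria for one indication."""
    return sum(w * getattr(indication, field) for field, w in weights.items())


def rank(indications, weights=WEIGHTS):
    """Return indications ordered from highest to lowest weighted score."""
    return sorted(indications, key=lambda i: score(i, weights), reverse=True)


# Invented candidates, not real assessments.
candidates = [
    Indication("Indication A", 0.6, 0.8, 0.5),
    Indication("Indication B", 0.3, 0.9, 0.9),
    Indication("Indication C", 0.7, 0.4, 0.4),
]

for indication in rank(candidates):
    print(f"{indication.name}: {score(indication):.2f}")
```

Keeping the weights in one explicit table makes the trade-offs visible, so experts can challenge them and re-run the ranking as new findings arrive.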

A hindering factor is that knowledge about failed assets and why they failed is often unpublished and limited. It can also be difficult to secure all necessary data to adequately analyse a drug’s potential benefits and alternative uses due to study design, lack of endpoints or the small number of patients enrolled. Such information gaps emphasise the need for clinical data scientists to supervise methodologies.

Connecting the context: what does it technically take?

It is important to recognise that extracting broad insights during preclinical stages is not straightforward. Using an LLM trained on internet-scale datasets is similar to walking into a library filled with books: plenty of data is available, but vast amounts are not relevant to your search.

Grounding LLMs with particular datasets and engineering prompts helps you access the appropriate data. A knowledge graph can help to visualise meaningful connections across entities.

With new algorithms that embed the context of relationships among entities, the LLM can act like a librarian who knows where to search among the library’s stacks, as well as how the books you are interested in relate to one another and why.

A graphical representation of connections in knowledge graphs can give therapeutic and clinical experts the insight to raise new questions. For example, if the aim is to map out a long-term disease strategy in immunology, it would be helpful to gauge what, if any, new biologic pathways or MOAs might spark a paradigm shift in the space or which indications may have stronger commercial potential or interest.

Figure 1. GenAI-driven knowledge graph to visualise meaningful connections across entities of interest.
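To make the idea of a knowledge graph more concrete, the minimal sketch below stores relationships as subject, relation, object triples and follows a chain of relations from an asset to the indications it might reach. The entities, relation names and helper functions are all hypothetical and chosen only for illustration; a production graph would be built from curated data and would be far larger.

```python
from collections import defaultdict

# Invented triples: (subject, relation, object).
TRIPLES = [
    ("Asset X", "inhibits", "Target T"),
    ("Target T", "participates_in", "Pathway P"),
    ("Pathway P", "implicated_in", "Indication A"),
    ("Pathway P", "implicated_in", "Indication B"),
    ("Competitor Y", "inhibits", "Target T"),
]


def build_graph(triples):
    """Index triples as subject -> list of (relation, object) edges."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def follow(graph, start, relation_path):
    """Follow a sequence of relations from start; return the nodes reached."""
    frontier = {start}
    for relation in relation_path:
        frontier = {
            obj
            for node in frontier
            for rel, obj in graph.get(node, [])
            if rel == relation
        }
    return frontier


def shared_target_assets(triples, asset):
    """List other assets that inhibit any target the given asset inhibits."""
    targets = {o for s, r, o in triples if s == asset and r == "inhibits"}
    return sorted(
        {s for s, r, o in triples if r == "inhibits" and o in targets and s != asset}
    )


graph = build_graph(TRIPLES)
reachable_indications = follow(
    graph, "Asset X", ["inhibits", "participates_in", "implicated_in"]
)
competitors = shared_target_assets(TRIPLES, "Asset X")
print(sorted(reachable_indications), competitors)
```

Queries like these are the starting point for the expert questions discussed above: the graph surfaces candidate connections, and specialists judge which ones are clinically meaningful.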

By collaborating with ML scientists and engineers, experts can dive deeper into extracted insights and train LLMs on different sets of tasks and/or datasets to build out scenarios for moving forward and identifying what to prioritise.

Fine-tuning strategic pathways forward

Several years ago, it would have been difficult to imagine drug developers in preclinical stages being able to realistically look a decade ahead with evidence-driven insight into the fuller potential of their assets and pipelines. But just as emerging medicine is personalising care for patients, drug development and commercialisation strategies are also being fine-tuned through advances in LLM frameworks.

Since nothing related to AI sits still for long, we know its promise for helping curate and connect meaningful data insights to show downstream value and potential of assets and pipelines will only grow.

References:

[1] Sun D, Gao W, Hu H, Zhou S. Why 90% of clinical drug development fails and how to improve it? Acta Pharm Sin B. 2022 Jul;12(7):3049-3062.

[2] U.S. Food and Drug Administration. FDA news release, “FDA announces plan to phase out animal testing requirement for monoclonal antibodies and other drugs.” April 10, 2025.

Meet the author

Greg Lever, Director, AI Solutions Delivery, IQVIA

With more than 15 years of life sciences and technology experience, Greg currently helps clients discover innovative ways to bring life-changing therapies to patients faster within IQVIA’s Applied Data Science Center’s consulting sales team. Previously, he led a team of ML engineers within IQVIA’s Analytics Center of Excellence.

Greg has worked with several technology startup companies in London and helped see Genomics England’s 100,000 Genomes Project through project completion. He earned his PhD at the University of Cambridge, combining quantum physics and ML to develop new approaches for small-molecule drug discovery, and has worked as a postdoctoral associate at MIT.
