Two approaches to AI in preclinical drug discovery are diverging – from multi-thousand-GPU systems to models with only a handful of parameters – and early results are raising questions about which will deliver.

In March 2026, two developments occurred within the same fortnight that, taken together, expose a strategic fault line running through AI-driven preclinical drug discovery. On one side, Roche unveiled the pharmaceutical industry’s largest hybrid-cloud AI factory – 3,500 NVIDIA Blackwell graphics processing units (GPUs) deployed across the US and Europe to power Genentech’s Lab-in-the-Loop strategy[1] – while Eli Lilly inaugurated LillyPod, a 1,016-GPU SuperPOD delivering over 9,000 petaflops to train biomedical foundation models for drug discovery and development.[2] On the other side, a team of researchers published a paper in Nature Communications demonstrating that the fundamental biological rules of RNA base pairing can be learnt from sequences alone, using a model containing just 21 parameters.[3]
No structural training data, no massive compute – just twenty-one parameters. These are not simply different points on the same continuum. They represent two distinct theories of how AI should engage with the molecular complexity of preclinical discovery – and the drug discovery community has not yet reckoned with the implications of their divergence.
The scale thesis has dominated infrastructure investment throughout 2025 and into 2026. Evo 2, Arc Institute’s genomic foundation model published in Nature, was trained on 9 trillion DNA base pairs with 40 billion parameters to predict functional impacts of genetic variation – including zero-shot BRCA1 variant pathogenicity – across all domains of life.[4] CellVQ, a single-cell foundation model, deploys 500 million parameters across 68 million cells to learn cell-state representations and build knowledge graphs for biological discovery.[5] scLong pushes to a billion parameters across all 28,000 human genes to capture long-range regulatory dependencies that conventional models miss.[6]
At NVIDIA GTC 2026, the Proteina-Complexa generative model was launched for structure-based protein binder design, with Novo Nordisk, Viva Biotech and Manifold Bio already using it to design and experimentally validate therapeutic candidates.[7] The AlphaFold Protein Structure Database was simultaneously expanded by roughly 30 million AI-predicted protein complex structures through a collaboration between NVIDIA, Google DeepMind, EMBL-EBI and Seoul National University.[7] The logic is clear: drug discovery requires navigating vast chemical and biological spaces, data is increasingly abundant and so scale wins.
The architecture thesis starts from a different premise: that biological systems contain deep structural regularities, and a model encoding those regularities as inductive biases can achieve competitive performance with dramatically fewer parameters. The 21-parameter RNA result is the sharpest illustration, but it is not isolated. EPInformer, also published in Nature Communications in March 2026, outperforms existing methods at predicting enhancer-gene regulatory interactions – a core task in target identification – with only 0.4 million parameters.[8]
Protein2PAM, published in Nature Biotechnology by Profluent, uses an evolution-informed protein language model trained on over 45,000 CRISPR-Cas protospacer adjacent motifs (PAMs) to achieve 50-fold improvements in Cas9 enzyme cleavage activity through computational evolution, without any laboratory screening or structural modelling.[9] These preclinical tools deliver experimentally confirmed results at a fraction of the compute cost of foundation model approaches.
The economic dimension of this divergence deserves closer examination. Tudor Oprea, CEO of Expert Systems, provided a rare quantitative framing at Bio-IT World in March 2026: generative AI is now running an estimated 10 to 20 million predictions per day across the pharmaceutical industry’s discovery pipelines, yet the field lacks a framework for valuing individual predictions.[10] Oprea’s position – that a trustworthy prediction should be worth at least 5 percent of the experiment it replaces – implies that most AI output may be economically negligible if it fails to demonstrably reduce wet-lab cycles. AstraZeneca’s Predictive Insights Platform alone generates roughly a million daily predictions.[10,11] If only a fraction of these alter a downstream medicinal chemistry decision, the return on the underlying compute infrastructure requires serious scrutiny.
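Oprea’s 5-percent heuristic can be made concrete with a back-of-envelope sketch. The prediction volumes below come from the article; the per-experiment cost and the fraction of predictions that actually displace an experiment are illustrative assumptions, not sourced figures:

```python
# Back-of-envelope sketch of Oprea's prediction-valuation heuristic.
# Dollar figures and the "actionable" fractions are illustrative assumptions.

def prediction_value(experiment_cost: float, trust_fraction: float = 0.05) -> float:
    """Value of one trustworthy prediction: at least 5% of the experiment it replaces."""
    return trust_fraction * experiment_cost

ASSUMED_EXPERIMENT_COST = 2_000.0   # USD per wet-lab experiment (hypothetical)
per_prediction = prediction_value(ASSUMED_EXPERIMENT_COST)   # 100.0 USD

# Lower bound of the industry-wide 10-20 million predictions/day cited above.
daily_predictions = 10_000_000

# If only a small fraction of predictions actually displace an experiment,
# the implied aggregate value collapses by the same factor.
for actionable_fraction in (1.0, 0.01, 0.0001):
    implied = daily_predictions * actionable_fraction * per_prediction
    print(f"actionable={actionable_fraction:.2%}: ${implied:,.0f}/day")
```

The point of the sketch is the sensitivity, not the absolute numbers: the implied daily value swings across four orders of magnitude depending entirely on how many predictions change a real decision.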
This scrutiny is especially relevant to the discovery-stage GPU arms race. Roche’s AI factory supports Genentech’s Lab-in-the-Loop – an approach the company describes as having been pioneered over the past five years.[1] Lilly’s LillyPod, meanwhile, powers the TuneLab platform, offering external biotechs federated access to discovery models built on $1 billion-worth of Lilly’s proprietary data.[2] Both represent serious commitments from organisations with deep discovery expertise. But the risk of infrastructure-driven reasoning – where the availability of 3,500 GPUs shapes which discovery questions get asked, rather than the underlying biology determining what compute is needed – is real and historically well documented in technology-intensive R&D.
Emerging evidence suggests that neither thesis is universally correct and that the most effective discovery strategy may be task-decomposed rather than monolithic. Some preclinical problems genuinely require scale: genome-wide target identification across millions of variant positions, multitarget selectivity modelling and virtual screening across billion-compound libraries all involve combinatorial complexity that cannot be reduced through architectural cleverness alone. Evo 2’s cross-domain variant interpretation[4] and the HAMGNN graph neural network’s integration of LLM-extracted knowledge for drug repurposing across 2.2 million biomedical knowledge-graph edges[12] both exemplify discovery problems where data volume and model capacity are irreducible requirements.
Conversely, problems with strong biological priors – RNA secondary structure, enzyme-substrate specificity, protein-PAM recognition, enhancer-gene regulation – appear to be domains where architecture-first approaches deliver disproportionate returns at the discovery bench. The pattern across multiple Q1 2026 publications is consistent: biologically grounded models punch above their parameter weight, achieving performance competitive with or superior to models orders of magnitude larger on specific, well-characterised preclinical tasks.
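The phrase ‘orders of magnitude’ is easy to gloss over. A quick comparison using only the parameter counts reported in the papers cited above makes the gap explicit:

```python
import math

# Parameter counts as reported in the cited Q1 2026 publications.
models = {
    "Evo 2 (genomic foundation model)": 40_000_000_000,
    "scLong (single-cell)": 1_000_000_000,
    "CellVQ (single-cell)": 500_000_000,
    "EPInformer (enhancer-gene)": 400_000,
    "RNA base-pairing model": 21,
}

baseline = models["RNA base-pairing model"]
for name, n_params in models.items():
    ratio = n_params / baseline
    print(f"{name}: {n_params:,} params "
          f"(~10^{math.log10(ratio):.1f} x the 21-parameter model)")
```

Evo 2 is roughly nine orders of magnitude larger than the 21-parameter RNA model, yet each wins on the tasks its design assumptions fit – which is precisely the portfolio argument.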
For discovery teams, the practical implication is that a blanket commitment to ‘building foundation models’ or ‘investing in GPU infrastructure’ conflates problems with fundamentally different computational profiles. A portfolio approach – large-scale infrastructure for genuinely data-hungry discovery problems, paired with lightweight biology-first models for tasks where domain knowledge can be formally encoded – is likely to outperform either strategy pursued in isolation. The 21-parameter provocation does not invalidate the scale thesis. Rather, it reveals that the field has been treating a portfolio problem as a scaling problem – and that distinction matters enormously when allocating preclinical research budgets in an industry where the estimated cost of bringing a drug to market is measured in billions of dollars.[13]
The organisations that navigate this divergence most effectively will be those that ask, before committing compute to a discovery programme, a deceptively simple question: what does the biology already know that the model does not need to learn from data? In Q1 2026, a growing body of peer-reviewed evidence suggests the answer is considerably more than the industry has assumed.
References
[1] Roche. Roche launches NVIDIA AI factory to accelerate the development of new therapeutics and diagnostics solutions. Press release, 16 March 2026. https://www.roche.com/media/releases/med-cor-2026-03-16
[2] NVIDIA Blog. Now Live: Lilly AI Factory for Pharmaceutical Discovery and Development. 27 February 2026. https://blogs.nvidia.com/blog/lilly-ai-factory-live/
[3] Pratap JS, Krueger RK, Rivas E. Deep-learning models demonstrate that RNA base pairing rules can be learned from sequences alone with minimal parameters. Nature Communications, 26 March 2026. https://www.nature.com/subjects/machine-learning
[4] Brixi G, et al. Genome modelling and design across all domains of life with Evo 2. Nature (2026). DOI: 10.1038/s41586-026-10176-5. https://www.nature.com/articles/s41586-026-10176-5
[5] Wang J, Tan C, Li SZ. CellVQ: Illuminating cell states by a comprehensive and interpretable single cell foundation model. Nature Communications (2026). DOI: 10.1038/s41467-026-70071-5. https://www.nature.com/articles/s41467-026-70071-5
[6] Bai D, et al. scLong: a billion-parameter foundation model for capturing long-range gene context in single-cell transcriptomics. Nature Communications (2026). DOI: 10.1038/s41467-026-69102-y. https://www.nature.com/articles/s41467-026-69102-y
[7] NVIDIA Newsroom. NVIDIA Expands Open Model Families to Power the Next Wave of Agentic, Physical and Healthcare AI. 16 March 2026. https://nvidianews.nvidia.com/news/nvidia-expands-open-model-families-to-power-the-next-wave-of-agentic-physical-and-healthcare-ai
[8] Lin J, Li Z, Zhao Y, et al. EPInformer: scalable and integrative prediction of gene expression from promoter-enhancer sequences with multimodal epigenomic profiles. Nature Communications (2026). DOI: 10.1038/s41467-026-70535-8. https://www.nature.com/articles/s41467-026-70535-8
[9] Nayfach S, et al. Customizing CRISPR-Cas PAM specificity with protein language models. Nature Biotechnology (2026). DOI: 10.1038/s41587-025-02995-0. https://www.nature.com/articles/s41587-025-02995-0
[10] Bio-IT World. The Value of Prediction, and Cost of Success, in Drug Hunting. 26 March 2026. https://www.bio-itworld.com/news/2026/03/26/the-value-of-prediction–and-cost-of-success–in-drug-hunting
[11] AstraZeneca Predictive Insights Platform. Drug Discovery Today (2024). DOI: 10.1016/j.drudis.2024.103945. https://doi.org/10.1016/j.drudis.2024.103945
[12] Saranya K, et al. Generative AI in drug repurposing and biomarker discovery: a multimodal approach (HAMGNN). Frontiers in Bioinformatics (2026). DOI: 10.3389/fbinf.2026.1755412. https://www.frontiersin.org/journals/bioinformatics/articles/10.3389/fbinf.2026.1755412/full
[13] Wouters OJ, McKee M, Luyten J. Estimated Research and Development Investment Needed to Bring a New Medicine to Market, 2009-2018. JAMA 323(9):844-853 (2020). DOI: 10.1001/jama.2020.1166. https://doi.org/10.1001/jama.2020.1166
