From fragments to maps: scaling drug–target interaction data
Posted: 31 March 2026 | Drug Target Review
Most drug–target data were never designed to be compared at scale. Pharmome mapping takes a different approach, building a shared dataset intended to support more predictable discovery.


Drug discovery has never lacked ideas. What it has lacked is reliable, comparable and comprehensive data. For decades, the industry has generated vast volumes of drug–target interaction data, yet much of it remains fragmented, inconsistent and inaccessible. The result is a system in which promising hypotheses struggle to translate into predictable outcomes in humans.
A collaboration between EvE Bio, Convergent Research, DrugBank and Hugging Face is focused on generating a systematic, standardised map of how approved drugs interact with human druggable targets. Described by its creators as pharmome mapping, the effort aims to transform decades of scattered pharmacological observations into a coherent public dataset for drug discovery.
A focused approach to hard scientific problems
The pharmome-mapping project sits within a broader organisational experiment led by Convergent Research. Founded to address scientific bottlenecks that fall between traditional academic funding and commercial incentives, Convergent designs and launches Focused Research Organisations, or FROs. These are time-limited, technically ambitious entities created to deliver clearly defined scientific outputs.
Anastasia Gamick, President and co-founder of Convergent Research, explains why this structure is particularly well suited to large public data efforts such as pharmome mapping. As she puts it:
“Large-scale public datasets occupy an awkward middle ground in science funding. They’re often too applied and infrastructure-heavy for traditional academic grants, which favour hypothesis-driven research and novel discoveries. But they’re also too pre-competitive for industry, which has little incentive to fund resources that benefit competitors equally.”
FROs are designed specifically for this kind of work, combining start-up-style execution with a public-benefit mandate. Rather than operating as permanent institutions, they are built around a defined goal, a realistic timeline and a clear theory of impact, allowing teams to focus entirely on delivery rather than continuous grant-seeking or commercial positioning.
EvE Bio is one of almost a dozen FROs launched by Convergent since 2022. Built specifically to generate a large-scale, open dataset of drug–protein interactions, it brings together full-time scientists, engineers and data specialists working towards a single deliverable on a fixed timeline.
What pharmome mapping actually means
EvE Bio’s focus is not the full human proteome, but the network of functional interactions between drugs and human druggable targets. This network is what the organisation refers to as the pharmome.
Elaine McVey Houskeeper, CEO and co-founder of EvE Bio, frames the problem in terms of translation. While most experimental compounds never reach humans, approved drugs represent a uniquely valuable resource. They have been tested in clinical trials, used in real-world settings and, in many cases, prescribed for years. Yet the molecular understanding of how these drugs interact across biological systems remains incomplete.
As Houskeeper explains, the issue is not a lack of information, but the nature of what is available:
“The field has accumulated decades of drug-target interaction data, but it’s limited to particular targets of interest, generated under inconsistent conditions, missing confirmed negative activity and largely sequestered inside corporations. What is publicly available is staggeringly incomplete. It’s data, but it’s not a dataset.”
EvE Bio’s pharmome-mapping project is designed to change that by generating interaction data at scale under consistent experimental conditions. The organisation is systematically testing FDA-approved small molecule drugs against hundreds of validated human druggable targets, recording both activity and confirmed inactivity. The goal is to create a dataset that is comprehensive enough and standardised enough to be genuinely useful as infrastructure.
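The difference between confirmed inactivity and a measurement that was never made can be illustrated with a small sketch. The drug and target names, the `Outcome` enum and both helper functions are hypothetical and chosen for illustration; none of this reflects EvE Bio's actual schema.

```python
from enum import Enum
from itertools import product


class Outcome(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"  # tested, and confirmed to show no activity


# Hypothetical drug and target names, for illustration only.
drugs = ["drug_a", "drug_b", "drug_c"]
targets = ["target_1", "target_2"]

# A fragmented collection: only some pairs were ever measured.
fragmented = {
    ("drug_a", "target_1"): Outcome.ACTIVE,
    ("drug_b", "target_2"): Outcome.ACTIVE,
    ("drug_c", "target_1"): Outcome.INACTIVE,
}


def coverage(results, drugs, targets):
    """Return the fraction of drug-target pairs with a recorded outcome."""
    pairs = list(product(drugs, targets))
    tested = sum(1 for pair in pairs if pair in results)
    return tested / len(pairs)


def untested_pairs(results, drugs, targets):
    """List pairs with no measurement; these are unknown, not inactive."""
    return [pair for pair in product(drugs, targets) if pair not in results]


print(coverage(fragmented, drugs, targets))        # 3 of 6 pairs measured
print(untested_pairs(fragmented, drugs, targets))  # the 3 pairs nobody tested
```

In a systematically generated dataset the coverage would be 1.0 and every absence of activity would appear as an explicit `INACTIVE` record rather than as a gap.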
How the pharmome data is generated
Producing hundreds of thousands of drug–target interaction measurements requires more than high throughput. Reliability depends on assay quality, experimental consistency and careful data processing.
EvE Bio’s platform is structured around three stages. First, an assay development group optimises and validates assays for each target class, ensuring consistent formats across pharmacological modes and signalling pathways. These assays are then passed to a quantitative screening team, which runs multi-concentration screens with replicates to capture the full spectrum of activity rather than binary outcomes. Finally, a data science team applies a uniform processing pipeline to every interaction tested, active or inactive.
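To show why multi-concentration screens with replicates carry more information than a binary hit call, the sketch below averages replicate responses and estimates the concentration at which the mean response crosses 50 per cent. It is a simplified illustration on synthetic data, not EvE Bio's processing pipeline: the Hill model, the dilution series, the fixed replicate offsets and the log-linear interpolation are all assumptions, and a production pipeline would more likely fit the full curve.

```python
import math
from statistics import mean


def hill(conc, ec50, top=100.0, bottom=0.0, slope=1.0):
    """Hill (four-parameter logistic) response at a given concentration."""
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** slope)


def half_max_concentration(concs, mean_responses, half=50.0):
    """Interpolate on a log10 axis where the response first crosses `half`.

    Returns None when the curve never crosses, meaning no measurable
    potency within the tested concentration range.
    """
    points = list(zip(concs, mean_responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo < half <= r_hi:
            frac = (half - r_lo) / (r_hi - r_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Synthetic eight-point, ten-fold dilution series in molar units.
concs = [10.0 ** e for e in range(-10, -2)]  # 1e-10 M up to 1e-3 M
true_ec50 = 1e-6

# Three replicates per concentration; fixed offsets stand in for noise.
offsets = [-2.0, 0.0, 2.0]
replicates = [[hill(c, true_ec50) + d for d in offsets] for c in concs]
mean_responses = [mean(r) for r in replicates]

estimate = half_max_concentration(concs, mean_responses)
flat = half_max_concentration(concs, [1.0] * len(concs))  # inactive compound

print(estimate)  # close to the true EC50 of 1e-6 M
print(flat)      # None: tested across the full range, no activity found
```

The inactive case returns an explicit result rather than being dropped, which mirrors the article's point that the same processing is applied to every interaction tested, active or inactive.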
This structure allows EvE Bio to scale without sacrificing comparability. Each data point is generated under defined conditions, using consistent methods and accompanied by detailed metadata. The resulting dataset grows incrementally, with new releases made publicly available as soon as they are validated.
How the data is shared
From the outset, pharmome mapping has been conceived as a public good. As a non-profit FRO, EvE Bio releases its data openly rather than reserving it for commercial advantage. This approach reflects a belief that foundational datasets deliver the greatest value when they are widely used, combined and scrutinised by the community.
To support that goal, EvE Bio has focused on distribution through platforms researchers already rely on. The dataset is available programmatically via Hugging Face, enabling immediate access for machine learning practitioners, and is also being integrated directly into DrugBank’s intelligence platform.
Turning interaction data into decisions
For DrugBank, the integration of pharmome data represents a significant expansion of its role in drug discovery workflows. Long recognised as a trusted source of structured drug data, DrugBank is now embedding large-scale interaction data alongside information on biology, diseases, clinical trials and sponsors.
Lisa Downey, a life sciences and health-data executive who joined DrugBank as CEO this past October, describes the value of this integration in practical terms.
“What ties these together is that isolated interaction data has limited utility. The value emerges when you can immediately ask follow-up questions: what’s the clinical precedent, what else do we know about this target, who’s already exploring this space?”
By linking EvE Bio’s systematic interaction measurements to DrugBank’s curated operating system, users can move from isolated observations to informed prioritisation. This has implications for safety pharmacology, drug repurposing, polypharmacology and computational modelling, particularly in early discovery where uncertainty is highest.
Ground truth for machine learning
The pharmome dataset also addresses a longstanding challenge in AI-driven drug discovery: the lack of high-quality public training data. Machine learning models are highly sensitive to bias, noise and missing negatives, all of which have plagued historical interaction datasets.
Georgia Channing, AI for Science Lead at Hugging Face, highlights the importance of EvE Bio’s standardised approach and the opportunities it creates for the research community:
“Machine learning models in drug discovery are only as good as their training data. Public datasets have historically been plagued by inconsistent methods and missing negatives. EvE’s data, with its standardised protocols and rigorous reporting of inactivity, provides the kind of clean, reproducible ground truth that model builders need.”
Hugging Face’s role is to ensure that this data is easy to access and combine with other resources. By hosting it in standard formats and supporting one-line loading, the platform lowers technical barriers and encourages collaboration between domain scientists and machine learning researchers.
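In practice, one-line loading would typically go through the `load_dataset` function of Hugging Face's `datasets` library. The sketch below is hedged accordingly: the article does not give the dataset's repository ID, so `REPO_ID` is a placeholder to be replaced, and the `train` split name and the `load_pharmome` wrapper are assumptions for illustration.

```python
# Placeholder only: the article does not give the dataset's repository ID,
# so look it up on the Hugging Face Hub and substitute it here.
REPO_ID = "<organisation>/<dataset-name>"


def load_pharmome(repo_id=REPO_ID, split="train"):
    """Load one split of the dataset through the `datasets` library."""
    if "<" in repo_id:
        raise ValueError("Set REPO_ID to the real dataset repository first.")
    # Imported lazily so the placeholder check works without the package.
    from datasets import load_dataset  # pip install datasets

    return load_dataset(repo_id, split=split)
```

Once a real repository ID is supplied, the call inside the wrapper is the single line that does the work; the guard only exists to stop the placeholder from being sent to the Hub.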
Early use cases and the road ahead
Although still expanding, the pharmome map is already being used in advanced AI research. EvE Bio’s data has served as a ground truth dataset for training and evaluating Ether0, a 24-billion-parameter chemistry reasoning model developed by FutureHouse, an early sign of uptake beyond the dataset’s original development context.
The dataset continues to grow on a bi-monthly release cadence. At the time of the DrugBank and Hugging Face announcements, it comprised 385,572 tested interactions across 159 targets. Subsequent releases have expanded coverage to more than 476,000 interactions across 207 targets, all tested against a library of 1,397 FDA-approved compounds.
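These figures allow a quick consistency check. The earlier count of 385,572 is exactly 1,397 × 276, which suggests that each compound in the library was run through 276 distinct assay conditions spread over the 159 targets. This is an inference from the arithmetic, not something the article states, and it would be consistent with some targets being tested in more than one pharmacological mode.

```python
import math

compounds = 1_397
interactions_at_announcement = 385_572
targets_at_announcement = 159

# If every compound was tested in every assay, the count divides evenly.
assays, remainder = divmod(interactions_at_announcement, compounds)
print(assays, remainder)  # 276 0

# Average number of assay conditions per target at that point.
print(round(assays / targets_at_announcement, 2))  # 1.74

# The later figure is given only as "more than 476,000", so only a
# lower bound on the number of assay conditions can be derived from it.
later_lower_bound = 476_000
min_later_assays = math.ceil(later_lower_bound / compounds)
print(min_later_assays)  # 341
```

The later bound of at least 341 assay conditions across 207 targets holds only under the same assumption, namely that every assay was screened against the full 1,397-compound library.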
Looking ahead to 2026, EvE Bio plans to significantly expand coverage of GPCRs and protein kinases, including pathway-specific data to support modelling of biased signalling. A second library of drugs and metabolites is also in development. By the end of its five-year timeline, the organisation aims to deliver a comprehensive, standardised map across major druggable protein families, with data released openly as it is generated and validated.
DrugBank, for its part, intends to continue expanding its coverage across compounds, targets and assay types, and to increase the value this offers its users. It will do this by embedding the data, alongside clinical trial history, outcomes and competitive activity data, into guided, AI-assisted workflows that support drug, target and disease prioritisation.
Infrastructure for more predictable discovery
Pharmome mapping will not eliminate failure from drug discovery. By replacing fragmented, selective data with a shared, standardised foundation, it offers a way to reduce uncertainty earlier in the process. The collaboration between EvE Bio, Convergent Research, DrugBank and Hugging Face illustrates how open infrastructure, built with discipline and intent, can change what is possible for the entire field.
Rather than asking whether a compound interacts with a single target, researchers can begin to ask broader, more relevant questions about molecular behaviour, safety and translational potential.


