
New framework enhances reliability of virtual cell models

Posted: 13 November 2025

Shift Bioscience has announced new research revealing that AI-driven virtual cell models perform far better than previously thought when assessed with correctly calibrated metrics.

Biotechnology company Shift Bioscience has announced new research that introduces an enhanced framework for evaluating benchmark metric calibration in virtual cell models. The findings show that, when evaluated with properly calibrated metrics, virtual cell models consistently outperform key baselines, providing biological data that can accelerate the identification of rejuvenation targets.

Improving confidence in AI virtual cells

Shift Bioscience, which is pioneering the use of artificial intelligence to understand and reverse the biology of ageing, has been focused on refining how AI-driven virtual cells are assessed. These genetic perturbation response models simulate how cells react to various genetic changes – such as the up- or down-regulation of specific genes – enabling researchers to predict outcomes without relying only on time-consuming and expensive laboratory experiments.

These models have long promised to speed up the discovery of potential therapeutic targets. However, recent academic studies have questioned their reliability, suggesting that in some situations virtual cell models fail to outperform simple, uninformative baselines. This criticism has raised doubts about their use in drug discovery pipelines.

 

Addressing metric miscalibration

In its latest study, Shift Bioscience provides evidence that these concerns may be misplaced. The research team found that poor model performance in some published reports is largely due to metric miscalibration – where the tools used to evaluate the models fail to accurately measure their predictive ability.

According to the study, the most commonly used metrics struggle to distinguish robust predictions from uninformative ones, particularly when datasets include weaker genetic perturbations. In response, Shift’s researchers developed an improved framework for metric calibration that can more reliably assess how well virtual cell models are performing.

Using 14 publicly available Perturb-seq datasets, the team identified several rank-based and differentially expressed gene (DEG)-aware metrics that remain well calibrated across diverse datasets. This new approach allows scientists to evaluate more accurately when a model is genuinely capturing meaningful biological signals.
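To give a rough sense of what a rank-based metric and its calibration can look like, here is a minimal Python sketch. It is not the metric or framework from the Shift Bioscience study: the function name `rank_score`, the use of Euclidean distance and the synthetic data are all assumptions made for illustration. The idea is to ask, for each perturbation, how many other perturbations' true profiles sit closer to the prediction than the correct one does.

```python
import numpy as np


def rank_score(pred, truth):
    """Hypothetical rank-based score in [0, 1] (illustrative only).

    pred, truth: arrays of shape (n_perturbations, n_genes).
    For each perturbation, count how many OTHER true profiles are strictly
    closer to its prediction than its own true profile is. A score of 1.0
    means the matching profile is always the closest.
    """
    n = truth.shape[0]
    # dists[i, j] = Euclidean distance between prediction i and true profile j
    dists = np.linalg.norm(pred[:, None, :] - truth[None, :, :], axis=2)
    own = np.diag(dists)  # distance from each prediction to its own target
    closer = (dists < own[:, None]).sum(axis=1)
    return 1.0 - closer.mean() / (n - 1)


# Synthetic stand-in for perturbation effects (not real Perturb-seq data)
rng = np.random.default_rng(0)
truth = rng.normal(size=(20, 50))

perfect = truth.copy()
noisy = truth + 0.1 * rng.normal(size=truth.shape)
# Uninformative baseline: predict the same mean profile for every perturbation
mean_baseline = np.tile(truth.mean(axis=0), (truth.shape[0], 1))

print("perfect      :", rank_score(perfect, truth))
print("noisy        :", rank_score(noisy, truth))
print("mean baseline:", rank_score(mean_baseline, truth))
```

In this toy setting a perfect prediction scores 1.0, while the constant mean prediction lands at 0.5, because every perturbation receives the same prediction and the ranks average out to chance. A metric with this property separates informative from uninformative predictions regardless of how strong the perturbations are, which is the kind of behaviour the article describes as well calibrated.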

Consistent outperformance of baselines

When tested using these newly calibrated metrics, virtual cell models demonstrated clear and consistent improvements over traditional uninformative baselines such as mean, control and linear models. The results indicate that AI virtual cells can recognise biologically significant patterns when evaluated appropriately – challenging previous conclusions that questioned their effectiveness.

By improving the accuracy of virtual cell evaluations, researchers can gain greater confidence in the models’ predictions and more efficiently identify promising genetic targets for ageing and disease interventions.

“This latest research from our talented team provides clear evidence that the reports of poor performance in AI virtual cells is largely due to limitations of metrics, not due to issues with the models,” said Dr Henry Miller, Head of Machine Learning at Shift Bioscience. “We showed that when models are evaluated on well-calibrated metrics, they perform quite well and consistently outperform key baselines. We believe that this work opens the door to more widespread use of virtual cells and reinforces our confidence in the virtual cell models that are helping to drive our target identification program for cell rejuvenation.”

Towards a new era in cellular ageing research

The study’s results highlight how well-calibrated evaluation frameworks can help to unlock the full potential of computational biology, supporting the company’s efforts to tackle the diseases and decline associated with ageing.
