Part two: how ChatGPT enriched animal study results

Recently there has been a flurry of announcements from AI-led biotechs around the potential of Large Language Models (LLM) in early drug discovery. In the second of a three-part series, Dr Raminderpal Singh presents an example of usage of ChatGPT, which demonstrates how accessible LLMs have become for lab scientists.

shutterstock_2419574549

In our previous article, we summarised the role and challenges of LLMs for early drug discovery. In this article we provide a simple case example to download and practise with ChatGPT, or other accessible LLM systems. This shows the power that LLMs offer to improve scientists’ daily tasks, despite their caveats and challenges. You can download all the source files to use yourself^{1 -}(see Simple ChatGPT Exercise). Thank you to Nina Truter²for her support in building this example. The example should work with any LLM but has been tested with ChatGPT.³

About the example

Goal of the example: Using extracted measurements from 10 papers on acarbose treated mice to improve the recommendations made from the results of the primary study.
Key outputs required from the example: Recommendation on dose, participants and measurements based on results from the primary study^4,5and papers on acarbose treated mice, with supporting data points.
Challenges faced in implementing the example: Creating prompts to accurately extract information to support recommendations, accurately describing the content of multiple files and papers.

Importantly, you should be aware that commonly accessible LLM systems often share inputs you provide, so it is recommended not to enter confidential information.

To help ChatGPT provide useful insights, there needs to be some ‘prompt engineering’. This is a technical term for best-practices in the way prompts are written. As an example, the first prompt in this example is only to provide background and context to ChatGPT:

“You are a drug discovery scientist looking to make decisions on dose, participants and measurements when taking an existing diabetes drug into the ageing-related diseases field. You have experimental results from a mouse study that show the effects of acarbose on lifespan, body weight, body composition, fat pads, glucose, grip strength, grip duration, rotarod and pathology. You also have several relevant scientific publications with studies investigating the effects of acarbose on different measurements in mice. You now want to interrogate your study results (which are in Excel files and images) and the publications separately for insights, and then together to get the best set of recommendations for your colleagues who are looking to perform early clinical trials with acarbose on ageing-related diseases. To do this, you will now process a series of specific user-entered ChatGPT prompts.”

The screenshot below shows the results from the last prompt. There are some nuances ChatGPT has not picked up on. For example, in female mice, the lifespan is not extended as much compared to male mice, but their physical measurements are improved. Improved prompts will aid the generation of more nuanced results.

Figure from Dr Raminderpal Singh, illustrating prompt results.

Figure from Dr Raminderpal Singh, illustrating prompt results.

Please comment below to share your findings from the example. Tell us if you managed to improve the output and, if so, how?

The next article in this series, published Monday 24 July, will discuss key challenges in the effective use of LLMs for early drug discovery, and present some practical approaches to address them.

References

¹Reading. HitchhikersAI.org. Available at: https://www.hitchhikersai.org/reading

²Nina Truter. LinkedIn. Available at: https://www.linkedin.com/in/nina-truter/

³ChatGPT. Available at: https://chatgpt.com/

⁴Alavez S, et al. Acarbose improves health and lifespan in aging HET3 mice. Aging Cell. 18(2) (2019 April). Available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6413665/

⁵Harrison DE, et al. ITP: Interventions Testing Program: Effects of various treatments on lifespan and related phenotypes in genetically heterogenous mice (UM-HET3) (2004-2023). Mouse Phenome Database. Available at: https://phenome.jax.org/projects/ITP1