Using knowledge graphs in drug discovery (Part 2): how they’re shaping scientific progress


Posted: 30 April 2025 | Andreas Kollegger (Head of GenAI Innovation at Neo4j), Dr Raminderpal Singh (Hitchhikers AI and 20/15 Visioneers)

In this second interview of the series, Andreas Kollegger, Head of GenAI Innovation at Neo4j, discusses how knowledge graphs and AI are transforming scientific discovery and improving life sciences workflows.


Looking towards the next five years, knowledge graphs are poised to play an increasingly central role in scientific discovery, particularly as reasoning capabilities in AI systems continue to mature. As Kollegger observes, the industry is moving beyond simply making the machinery work to focusing on setting frameworks and motivations for scientific discovery. This evolution promises to accelerate the drug discovery process significantly, though with important caveats about the nature and limitations of AI advancement.

The rise of multimodal knowledge graphs

A major upcoming development is the shift towards multimodal knowledge graphs. While current knowledge graphs are primarily text-based, future systems will likely incorporate diverse data types – such as molecular structures, laboratory measurements and imaging data. This integration of multiple data modalities will be particularly valuable in drug discovery, where insights often emerge from the combination of computational and experimental data. The challenge lies not just in storing this diverse data, but in creating meaningful relationships between different types of scientific information.


The future landscape will likely see the emergence of what Kollegger terms a “multi-graph space,” where different types of graphs (geospatial, vector-based, textual) coexist and interact. This architectural approach reflects the reality of how scientific knowledge is organised – different aspects of research require different types of representation. For instance, a drug discovery project might simultaneously need to work with molecular structure graphs, protein interaction networks and clinical trial outcome data. Each of these domains has its own natural graph structure, and the future challenge lies in creating meaningful connections between these different graph spaces.
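As a rough illustration of the "multi-graph space" idea, the sketch below keeps nodes from several graph spaces in one structure and picks out the relationships that bridge them. The class names, graph labels (`molecular_structure`, `protein_interaction`, `clinical_trial`) and entities are hypothetical placeholders chosen for this example, not part of any Neo4j API or of Kollegger's own design.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    graph: str  # which graph space the node belongs to (hypothetical labels)
    key: str    # identifier within that graph space


@dataclass
class MultiGraph:
    # Adjacency list: node -> list of (relationship, neighbour)
    edges: dict = field(default_factory=lambda: defaultdict(list))

    def link(self, src: Node, rel: str, dst: Node) -> None:
        self.edges[src].append((rel, dst))

    def cross_graph_links(self):
        """Yield only the edges whose endpoints sit in different graph spaces."""
        for src, outgoing in self.edges.items():
            for rel, dst in outgoing:
                if src.graph != dst.graph:
                    yield src, rel, dst


mg = MultiGraph()
compound = Node("molecular_structure", "compound_X")
scaffold = Node("molecular_structure", "scaffold_A")
protein = Node("protein_interaction", "protein_P")
trial = Node("clinical_trial", "trial_T1")

mg.link(compound, "HAS_SCAFFOLD", scaffold)  # stays inside one graph space
mg.link(compound, "BINDS", protein)          # bridges two graph spaces
mg.link(trial, "EVALUATES", compound)        # bridges two graph spaces

bridges = [(s.key, r, d.key) for s, r, d in mg.cross_graph_links()]
print(bridges)
```

In this toy version the "meaningful connections" between graph spaces are simply the edges returned by `cross_graph_links`; a real system would also need to reconcile identifiers and schemas across sources.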


This multi-graph approach will be supported by what Kollegger describes as a “mixture of experts” model in AI systems. Rather than relying on a single generalist AI model, future systems will likely employ specialised models for different types of data and analysis, coordinated by a generalist AI that can integrate insights across domains. This approach mirrors the way human experts collaborate in drug discovery, where specialists in different areas work together to develop new therapeutic approaches.
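The coordination pattern described above can be sketched as a simple router: a generalist step hands each piece of data to a specialist and gathers the findings. This is only an illustration of the idea as used in the article, not a neural mixture-of-experts layer; the specialist functions, modality names and sample items are all hypothetical stand-ins for real models and data.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical specialists: plain functions standing in for real models.
def structure_expert(item: str) -> str:
    return f"structure analysis of {item}"


def imaging_expert(item: str) -> str:
    return f"image analysis of {item}"


def text_expert(item: str) -> str:
    return f"literature summary for {item}"


EXPERTS: Dict[str, Callable[[str], str]] = {
    "molecule": structure_expert,
    "image": imaging_expert,
    "text": text_expert,
}


def coordinate(tasks: List[Tuple[str, str]]) -> List[str]:
    """Generalist step: route each (modality, item) pair to its specialist.

    Modalities without a registered specialist are flagged rather than guessed at.
    """
    findings = []
    for modality, item in tasks:
        expert = EXPERTS.get(modality)
        if expert is None:
            findings.append(f"no specialist for {modality!r}: {item}")
        else:
            findings.append(expert(item))
    return findings


report = coordinate([
    ("molecule", "compound_X"),
    ("image", "assay_plate_7"),
    ("audio", "lab_note"),
])
for line in report:
    print(line)
```

Keeping the specialists behind a plain lookup table means a new modality can be added without changing the coordinating step, which loosely mirrors how a new specialist might join a project team.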

Knowledge graphs as a trust layer

As AI systems become more sophisticated, the role of knowledge graphs as a ‘trust layer’ between AI and human researchers will become increasingly important. While large language models (LLMs) can generate insights rapidly, knowledge graphs provide a verifiable foundation of curated relationships that researchers can audit and validate. This will prove particularly crucial as the reasoning capabilities of AI systems advance. As Kollegger notes, researchers will want not only to see the conclusions drawn by AI systems but also to examine the knowledge graphs that were used to derive those conclusions.
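One way to picture the trust layer is as a lookup that returns the chain of curated relationships behind a claim, so a researcher can inspect each step and its source. The sketch below assumes a tiny in-memory edge list; the entities, relationship names and source labels are invented for illustration and do not come from a real dataset.

```python
from collections import deque

# Hypothetical curated edges: (subject, relation, object, source)
EDGES = [
    ("compound_X", "INHIBITS", "protein_P", "curated_assay_record_1"),
    ("protein_P", "PARTICIPATES_IN", "pathway_Q", "curated_pathway_entry_2"),
    ("pathway_Q", "IMPLICATED_IN", "disease_D", "curated_review_3"),
]


def evidence_path(start, goal, edges):
    """Breadth-first search for a chain of curated edges from start to goal.

    Returns the list of edges on the shortest chain, or None if the graph
    holds no support for the connection.
    """
    outgoing = {}
    for edge in edges:
        outgoing.setdefault(edge[0], []).append(edge)

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for edge in outgoing.get(node, []):
            nxt = edge[2]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [edge]))
    return None


path = evidence_path("compound_X", "disease_D", EDGES)
unsupported = evidence_path("compound_X", "disease_Z", EDGES)

for subject, relation, obj, source in path:
    print(f"{subject} -[{relation}]-> {obj}  (source: {source})")
print("unsupported claim:", unsupported)
```

A conclusion for which `evidence_path` returns `None` has no backing in the curated graph, which is the kind of signal a reviewer could use to decide what to audit first.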


The integration of knowledge graphs with reasoning models represents another frontier of development. Current AI models excel at pattern matching and information retrieval, but the next generation of systems will need to perform more sophisticated reasoning tasks. Knowledge graphs will play a crucial role in this evolution by providing structured, validated information that reasoning models can work with. This combination could enable more complex forms of scientific inference while maintaining traceability and reliability.

Lower barriers to adoption

For organisations looking to adopt knowledge graph technology, the entry point is now more accessible than ever. Tools like Neo4j’s Knowledge Graph Builder and frameworks like LangChain offer user-friendly starting points for building and experimenting with knowledge graphs. The key is to start small, allowing researchers to experience what Kollegger calls the “graph epiphany” – the moment when the value of graph-based representation becomes clear in their specific domain. This approach allows organisations to gradually build expertise while delivering immediate value.

The need for better tools

The tooling landscape for knowledge graphs is also expected to evolve significantly. While current tools excel at creating and managing individual graphs, there is a growing need for better tools to handle multi-graph environments and to support the integration of different data modalities. As Kollegger points out, the industry requires improved means of helping users navigate between different levels of abstraction and different types of graph representations.


Looking beyond pure technology, the future of knowledge graphs in drug discovery will likely be shaped by their ability to support collaboration between human researchers and AI systems. The goal is not to replace human expertise but to augment it by providing better tools for managing and reasoning about complex scientific knowledge. Knowledge graphs offer a unique advantage here because they represent information in a way that is both machine-actionable and human-interpretable.

As reasoning capabilities in AI systems continue to advance, knowledge graphs are expected to become even more valuable as a structured foundation for scientific discovery. Their ability to represent complex relationships in a way that bridges traditional scientific knowledge and emerging AI capabilities positions them as a crucial technology for the future of drug discovery. However, realising this potential will require continued development of both the underlying technology and the tools that make it accessible to researchers.

To understand the foundations of this topic, see part one of this series.

About the authors

Dr Raminderpal Singh

Dr Raminderpal Singh is a recognised visionary in the implementation of AI across technology and science-focused industries. He has over 30 years of global experience leading and advising teams, helping early to mid-stage companies achieve breakthroughs through the effective use of computational modelling.

Raminderpal is currently Global Head of AI and GenAI Practice at 20/15 Visioneers and leads the HitchhikersAI.org open-source community. He is also a co-founder of Incubate Bio – a techbio providing a service to life sciences companies who are looking to accelerate their research and lower their wet lab costs through in silico modelling.

Raminderpal has extensive experience building businesses in both Europe and the US. As a business executive at IBM Research in New York, Dr Singh led the go-to-market for IBM Watson Genomics Analytics. He was also Vice President and Head of the Microbiome Division at Eagle Genomics Ltd, in Cambridge. Raminderpal earned his PhD in semiconductor modelling in 1997. He has published several papers and two books and has twelve issued patents. In 2003, he was selected by EE Times as one of the top 13 most influential people in the semiconductor industry.

For more: http://raminderpalsingh.com; http://20visioneers15.com; http://hitchhikersAI.org

Andreas Kollegger

Andreas is a technological humanist. He began at NASA, designing systems from scratch to support scientific missions. In Zambia, he developed medical informatics systems to apply technology for social good. Now at Neo4j, he’s focused on democratising graph databases to validate and extend our understanding of how the world works.

