Novel algorithm uses mass spec data to predict identity of molecules

A new algorithm called MolDiscovery uses mass spectrometry data from molecules to predict their identity and whether they are unknown substances.

An algorithm designed by researchers from Carnegie Mellon University, US, and St Petersburg State University, Russia, could help scientists identify unknown molecules. The algorithm, called MolDiscovery, uses mass spectrometry data from molecules to predict the identity of unknown substances, telling scientists early in their research whether they have uncovered something new or are rediscovering something already known.

According to the team, this development could save time and money in the search for new naturally occurring products that could be used in medicine.

“Scientists waste a lot of time isolating molecules that are already known, essentially rediscovering penicillin,” said Assistant Professor Hosein Mohimani, one of the researchers. “Detecting whether a molecule is known or not early on can save time and millions of dollars and will hopefully enable pharmaceutical companies and researchers to better search for novel natural products that could result in the development of new drugs.”

 

Mohimani explained that after a scientist detects a molecule that holds promise as a potential drug, in a marine or soil sample for example, it can take a year or longer to identify the molecule, with no guarantee that the substance is new. MolDiscovery uses mass spectrometry measurements and a predictive machine learning model to identify molecules quickly and accurately.

Mass spectrometry measurements are the fingerprints of molecules but, unlike fingerprints, there is no comprehensive database to match them against. Even though hundreds of thousands of naturally occurring molecules have been discovered, scientists do not have access to their mass spectrometry data. MolDiscovery predicts the identity of a molecule from its mass spectrometry data without relying on a database of mass spectra to match it against.

The team hopes MolDiscovery will be a useful tool for labs in the discovery of novel natural products.

The team’s work is published in Nature Communications.
