A machine learning model that could identify antibody targets

Using a machine learning model, scientists could predict not only the virus an antibody will attack, but which features on the pathogen the antibody binds to.


A new study by researchers at the University of Illinois Urbana-Champaign, US, has shown that machine learning can use the genetic sequences of a person’s antibodies to predict which pathogens those antibodies will target. Recently published in Immunity, the new approach successfully differentiates between antibodies against influenza and those attacking SARS-CoV-2.


Different antibodies (green, aqua, pink) attack different parts of the SARS-CoV-2 viral particle (yellow/orange sphere). The virus’s spike proteins (purple) are a key antibody target, with some antibodies attaching to the top (darker purple) and others to the stem (paler zone).
[Credit: Graphic by Yiquan Wang].

“Our research is in a very early stage, but this proof-of-concept study shows that we can use machine learning to connect the sequence of an antibody to its function,” said Professor Nicholas Wu.

With enough data, scientists should be able to predict not only the virus an antibody will attack, but also which features on the pathogen the antibody binds to. For example, different antibodies may attach to different parts of the spike protein on the SARS-CoV-2 virus. Because some targets on a pathogen are more vulnerable than others, this knowledge will allow scientists to predict the strength of a person’s immune defence.

“In 20 years, scientists have discovered about 5,000 antibodies against the flu virus,” explained Wu. “But in just two years, people have identified 8,000 antibodies for COVID. This provides an opportunity that has never been seen before to study how antibodies work and to do this kind of prediction.”

The researchers used antibody data from 88 published studies and 13 patents. The datasets were large enough to train a model that makes predictions from the antibodies’ genetic sequences.

The model was designed to distinguish whether the sequences coded for antibodies targeting regions on the influenza virus or on the SARS-CoV-2 virus. The accuracy of these predictions was close to 85 percent overall.
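
As a rough illustration of what sequence-based classification involves, the sketch below trains a tiny naive Bayes classifier on overlapping three-letter fragments (k-mers) of amino-acid sequences and measures accuracy on held-out examples. It is not the model used in the study, which this article does not describe. The class name `KmerNaiveBayes`, the helper functions and the example sequences are all invented for illustration and are not real antibody data.

```python
from collections import Counter
import math


def kmers(seq, k=3):
    """Return all overlapping substrings of length k in a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


class KmerNaiveBayes:
    """Hypothetical toy classifier: multinomial naive Bayes over k-mer counts."""

    def __init__(self, k=3):
        self.k = k
        self.counts = {}   # label -> Counter of k-mers seen for that label
        self.totals = {}   # label -> total number of k-mers for that label
        self.priors = {}   # label -> log prior probability
        self.vocab = set()

    def fit(self, sequences, labels):
        label_freq = Counter(labels)
        for seq, label in zip(sequences, labels):
            self.counts.setdefault(label, Counter()).update(kmers(seq, self.k))
        for label, counter in self.counts.items():
            self.totals[label] = sum(counter.values())
            self.vocab.update(counter)
            self.priors[label] = math.log(label_freq[label] / len(labels))
        return self

    def predict(self, seq):
        vocab_size = len(self.vocab)

        def score(label):
            # Log prior plus log likelihood of each k-mer, with add-one smoothing
            total = self.priors[label]
            for km in kmers(seq, self.k):
                count = self.counts[label][km]
                total += math.log((count + 1) / (self.totals[label] + vocab_size))
            return total

        return max(self.counts, key=score)


def accuracy(model, sequences, labels):
    """Fraction of sequences whose predicted label matches the true label."""
    correct = sum(model.predict(s) == y for s, y in zip(sequences, labels))
    return correct / len(labels)


# Made-up sequences, for illustration only (not real antibody data)
train_seqs = ["ARDYYGSGSYYFDY", "ARDYYGSSSYYFDY", "AKGGWELLRFDP", "AKGGWELPRFDP"]
train_labels = ["influenza", "influenza", "SARS-CoV-2", "SARS-CoV-2"]
test_seqs = ["ARDYYGSGSYFDY", "AKGGWELLRFDY"]
test_labels = ["influenza", "SARS-CoV-2"]

model = KmerNaiveBayes(k=3).fit(train_seqs, train_labels)
print("held-out accuracy:", accuracy(model, test_seqs, test_labels))
```

The accuracy printed here says nothing about real antibodies. It only shows how a figure like the one reported above is calculated: the share of held-out sequences that the model assigns to the correct virus.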


The team is working to improve its model so that it can more precisely determine which parts of the virus the antibodies attack.

“If we can make these predictions based on antibody sequence, we might also be able to go back and design antibodies that bind to specific pathogens,” Wu concluded. “This is not something that we can do now, but those are some implications for future study.”
