Beyond templates: advancing protein–protein interaction structure prediction with AI


Posted: 3 October 2025 | Dr Alan Nafiiev (CEO and Founder of Receptor.AI)

Dr Alan Nafiiev evaluates template-based, docking and template-free approaches to PPI prediction, highlighting how AI can enhance structural accuracy.

3D illustration of protein complexes showing molecular surface detail in red, orange and yellow, representing protein–protein interactions relevant to drug discovery.

Introduction

Protein–protein interactions (PPIs) are the specific contacts formed when two proteins bind. Understanding their three-dimensional structures is essential for designing compounds that can modulate PPIs, a promising strategy for treating many diseases.¹ However, experimentally determining PPI structures is slow, costly and often technically challenging.

To date, the primary computational approach for PPI structure prediction has been protein–protein docking, in which known protein structures are positioned relative to each other to identify a plausible interface. This method remains fundamentally limited because it typically treats proteins as rigid bodies and fails to account for solvent effects, side-chain rearrangements, backbone flexibility and other biophysical factors.²

Now, artificial intelligence (AI) is poised to break through these limits.

Template-based PPI structure prediction

Template-based PPI prediction methods assemble complexes by taking two target sequences, finding a homologous complex in a structural database, and ‘grafting’ the known backbone and interface onto the new pair. When a close template exists, this can yield remarkably accurate interfaces in minutes. Recent deep learning models, such as AlphaFold-Multimer and RoseTTAFold, also draw on these same resolved complexes as part of their training data, but go further by integrating deep multiple-sequence alignments and co-evolutionary signals.³

Yet the template library itself remains sparse. BioGRID curates evidence for over 1.4 million human PPIs,⁴ but only 4,594 complexes have high-resolution structures in PDBbind-plus, a database designed to offer a comprehensive collection of experimental binding affinity data for biomolecular complexes.⁵ That is roughly 0.3 percent, so templates cover under 1 percent of the estimated human interactome even after adding unique structures from other sources. Moreover, the complexes we do know are heavily biased towards stable, soluble, globular assemblies, while most cellular interactions are transient, involve intrinsically disordered regions, or occur at membrane surfaces. Because template-based methods depend directly on the available structural data, their accuracy collapses outside this narrow subset.
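
The coverage figure can be checked with a few lines of arithmetic. The sketch below uses only the two numbers quoted above; the variable names are illustrative, and because BioGRID lists "over" 1.4 million interactions, the result should be read as an upper bound on coverage from this one database.

```python
# Figures quoted in the text; variable names are illustrative.
known_human_ppis = 1_400_000   # BioGRID-curated human PPIs ("over 1.4 million")
resolved_complexes = 4_594     # high-resolution complexes in PDBbind-plus

# Share of known interactions that have a resolved complex structure.
coverage = resolved_complexes / known_human_ppis
print(f"Structural coverage: {coverage:.2%}")
```

Even if other structural sources added several thousand further unique complexes, the share would stay below 1 percent.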

Figure 1. The proportion of resolved PPI complexes among known PPIs. Credit: Receptor.AI

Template-free PPI structure prediction

Receptor.AI’s DeepTemplateAGnostic (DeepTAG) technology illustrates an alternative approach. Template-free PPI prediction takes a fundamentally different tack: instead of searching for a matching scaffold, it first scans each protein surface to locate ‘hot-spots’, clusters of residues whose side-chain properties (size, hydrophobicity, charge potential and solvent exposure) favour binding.⁶

Figure 2. DeepTAG workflow. Candidate interfaces are split into three sub-interfaces and scored by different AI layers trained on different data. A consensus score is then determined. Credit: Receptor.AI

Once hot-spots on each partner are identified, hot-spot matching is performed to define a limited set of candidate interfaces. For each candidate, contact matrices are constructed that describe which residues on protein A lie within binding distance of residues on protein B.
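
To make the idea of a contact matrix concrete, here is a minimal sketch, not DeepTAG's implementation. It assumes each residue is represented by a single point (for example its C-alpha atom) and that "binding distance" means a fixed 8 Å cutoff; the function name, the cutoff and the coordinates are all assumptions chosen for illustration.

```python
import math

def contact_matrix(coords_a, coords_b, cutoff=8.0):
    """Return a binary matrix: entry [i][j] is 1 if residue i of protein A
    lies within `cutoff` angstroms of residue j of protein B, else 0.
    The 8 angstrom default is an assumed, commonly used convention."""
    return [
        [1 if math.dist(a, b) <= cutoff else 0 for b in coords_b]
        for a in coords_a
    ]

# Hypothetical representative-atom coordinates in angstroms.
protein_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
protein_b = [(0.0, 6.0, 0.0), (20.0, 0.0, 0.0)]

matrix = contact_matrix(protein_a, protein_b)
for row in matrix:
    print(row)
# prints:
# [1, 0]
# [1, 0]
# [0, 0]
```

Each row corresponds to a residue of protein A and each column to a residue of protein B, so a candidate interface is summarised as a small grid that a scoring model can take as input.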

Here, machine learning excels: models trained on residue-to-residue contacts within folded domains score each inter-protein contact matrix for its predicted binding energy. Because monomeric structures are far better characterised than structures of protein complexes, this method also works for complexes that bear little resemblance to any known structure.

With the best-scored interface defined, the rest of the complex is built around it and the full assembly is tested for stability using molecular dynamics simulations. As the PPI interface itself is the primary target for drug design, ensuring its accuracy, even if other regions are modelled with less precision, is sufficient for downstream drug discovery efforts.

Benchmark results

Figure 3. Comparison of docking and AI methods on the PINDER-AF2 benchmark (30 unbound pairs). Bars show the percentage of complexes achieving Acceptable, Medium, or High DockQ accuracy for each method’s Top-1, Top-5 and All predictions. Credit: Receptor.AI

To objectively compare template-based, docking and template-free workflows, we tested them in a standardised benchmark of challenging targets – PINDER-AF2. Its dataset comprises 30 protein–protein complexes provided only as unbound monomer structures, mirroring real-world scenarios where no prior complex is available. Each method was tasked with generating complexes that were then evaluated by the CAPRI DockQ metric, which scores structural similarity to the native complex on a scale where 0.23–0.49 is Acceptable, 0.49–0.80 is Medium, and above 0.80 is High.⁷ Results are reported for the Top-1 prediction, the best among the Top-5, and across all predictions.
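
The evaluation scheme described above can be sketched as follows. This is an illustrative reconstruction, not the benchmark's own code: the function names and the example scores are hypothetical, scores below 0.23 are labelled "Incorrect", and a score exactly on a boundary is assigned to the higher class, which is an assumption about how ties are handled.

```python
def dockq_class(score):
    """Map a DockQ score to its quality class using the thresholds above."""
    if score >= 0.80:
        return "High"
    if score >= 0.49:
        return "Medium"
    if score >= 0.23:
        return "Acceptable"
    return "Incorrect"

def success_rate(ranked_scores_per_target, top_n=None, threshold=0.23):
    """Fraction of targets whose best DockQ among the first `top_n` ranked
    models (all models if top_n is None) reaches `threshold`."""
    hits = 0
    for scores in ranked_scores_per_target:
        considered = scores if top_n is None else scores[:top_n]
        if considered and max(considered) >= threshold:
            hits += 1
    return hits / len(ranked_scores_per_target)

# Hypothetical DockQ scores, listed in each method's own ranking order.
targets = [
    [0.85, 0.40, 0.10],                     # good model ranked first
    [0.05, 0.10, 0.55, 0.20],               # good model only at rank 3
    [0.02, 0.04, 0.08, 0.10, 0.12, 0.30],   # good model only at rank 6
    [0.01, 0.03],                           # no acceptable model
]

print(success_rate(targets, top_n=1))   # 0.25
print(success_rate(targets, top_n=5))   # 0.5
print(success_rate(targets))            # 0.75
```

A gap between the Top-1 and All rates, as in this toy data, indicates that good models are being generated but not ranked first, which is a scoring problem rather than a sampling problem.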

On this benchmark, template-based prediction, exemplified by AlphaFold-Multimer, performs worse than classic rigid-body docking with HDOCK. AlphaFold-Multimer’s metrics barely change when the evaluation expands from Top-1 to All predictions, meaning that the model simply fails to generate enough high-quality interfaces.

In contrast, template-free prediction already outperforms protein–protein docking even in the Top-1 results. Moreover, the plots show that DeepTAG generates an even larger share of high-quality complexes (nearly half of all candidates reach ‘High’ accuracy), although the model does not yet rank all of them near the top. Ongoing work on scoring is therefore expected to improve the model’s performance in real drug discovery further.

Conclusion

In conclusion, template-free PPI prediction sidesteps the limits of template scarcity by focusing on protein surface hot-spots. On the PINDER-AF2 benchmark, DeepTAG already outperforms protein–protein docking in accuracy and, with ongoing scoring improvements, promises stronger support for PPI targeting in drug discovery. AI is a powerful tool but not a one-size-fits-all solution: it must be applied thoughtfully and in the right context.

Meet the author

Dr Alan Nafiiev is the CEO and Founder of Receptor.AI, where he develops AI-driven infrastructure for drug discovery. With a PhD in computer science, his expertise spans small molecules, peptides, protein–protein interactions and induced proximity therapeutics. His current work and writing focus on hybrid AI–physics methods, large language model-orchestrated R&D strategies, and scalable automation for evidence-based drug design.

References

1. Singh A, Nguyen B, Ng HL. Editorial: Prediction of protein-protein interactions (PPIs): the next frontier. Frontiers in Molecular Biosciences. 2024;11. doi: 10.3389/fmolb.2024.1479705
2. Desta IT, Porter KA, Xia B, et al. Performance and its limits in rigid body protein-protein docking. Structure. 2020;28(9):1071-1081.e3. doi: 10.1016/j.str.2020.06.006
3. Evans R, O’Neill M, Pritzel A, et al. Protein complex prediction with AlphaFold-Multimer. bioRxiv. 2021. doi: 10.1101/2021.10.04.463034
4. BioGRID. Statistics. The Biological General Repository for Interaction Datasets. Accessed August 15, 2025. https://wiki.thebiogrid.org/doku.php/statistics
5. PDBbind-Plus database. Accessed August 15, 2025. https://www.pdbbind-plus.org.cn/
6. Chen YC, Sargsyan K, Wright JD, et al. PPI-hotspotID for detecting protein–protein interaction hot spots from the free protein structure. eLife. 2024;13:RP96643. doi: 10.7554/eLife.96643.3
7. Basu S, Wallner B. Finding correct protein–protein docking models using ProQDock. Bioinformatics. 2016;32(12):i262-i270. doi: 10.1093/bioinformatics/btw257

Related organisations
Receptor.AI
