InteractBind

Large-scale protein-ligand dataset and benchmark that probes whether models genuinely localize binding sites and recover interaction types, rather than merely predicting binding likelihood, using ligand-similarity-controlled splits.

Composite

47.5

Experimental validation

Retrospective

Stages

Hit ID

Modalities

small molecule, biologic

Task types

classification, retrieval, docking

Size

complexes: not yet reported
tasks: 3
splits: train, val, and test counts not yet reported
note: Curated from PDB-derived structures; exact counts pending camera-ready release

License

CC-BY

First release

2026-05-21

Last updated

2026-05-21

Official site

→ project page

Leaderboard

→ leaderboard

Dataset

→ dataset

Code / GitHub

→ repository

HuggingFace

→ HF

Paper

A Large-Scale Dataset and Benchmark: Do Protein-Ligand Models Learn Binding Sites or Just Binding Likelihood? · Zhaohan Meng, Zhen Bai, Ke Yuan, Iadh Ounis, Zaiqiao Meng, Hao Xu, Joseph Loscalzo · 2026 · paper · doi:10.48550/arXiv.2605.24045 · 0 citations

Flags

none

Experts

—

Groups

—

Hosted by

—

Related benchmarks

PLINDER, PINDER, PoseBusters, LIT-PCBA

Rubric (7-criterion)

rigor

coverage

maintenance

adoption

quality

accessibility

industry_relevance

Notes

arXiv preprint (21 May 2026). Valuable diagnostic framing: separates true binding-site localization from affinity/likelihood shortcuts, with ligand-similarity-controlled splits to test generalization to novel proteins. The benchmark is brand new (zero citations), the code is not yet public, and the dataset size is not yet finalized, so accessibility and adoption are scored conservatively. Strong scientific motivation; co-authors include a clinical systems-biology group (Loscalzo, Harvard).
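
To make the idea of a ligand-similarity-controlled split concrete, here is a minimal sketch. It is not the benchmark's actual procedure, which is not described on this page. It assumes each ligand is represented by a set of fingerprint on-bits, uses Tanimoto similarity, and groups ligands by single-linkage clustering so that no train/test pair reaches the similarity threshold. The function names, the 0.5 threshold, and the toy ligand IDs are all hypothetical.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_controlled_split(fingerprints, threshold=0.5, test_fraction=0.2):
    """Split ligand IDs so that no train/test pair has similarity >= threshold.

    fingerprints: dict mapping ligand ID -> frozenset of on-bits
    (a hypothetical input format chosen for illustration).
    """
    ids = sorted(fingerprints)
    parent = {i: i for i in ids}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Single-linkage clustering: merge any two ligands at or above the threshold.
    for a, b in combinations(ids, 2):
        if tanimoto(fingerprints[a], fingerprints[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)

    # Assign whole clusters to one side; fill the test set with small clusters first.
    train, test = [], []
    target = test_fraction * len(ids)
    for members in sorted(clusters.values(), key=lambda m: (len(m), m)):
        if len(test) + len(members) <= target:
            test.extend(members)
        else:
            train.extend(members)
    return train, test


# Toy usage with made-up fingerprints.
fps = {
    "lig_a": frozenset({1, 2, 3, 4}),
    "lig_b": frozenset({1, 2, 3, 5}),
    "lig_c": frozenset({10, 11, 12}),
    "lig_d": frozenset({10, 11, 13}),
    "lig_e": frozenset({20, 21}),
}
train, test = similarity_controlled_split(fps, threshold=0.5, test_fraction=0.2)
```

Because whole clusters are kept on one side of the split, every test ligand is below the threshold against every training ligand. Real pipelines would typically compute fingerprints with a cheminformatics toolkit instead of hand-written bit sets.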
