CompGen-MLIP: Compositional Generalisation for ML Interatomic Potentials
A benchmark of 4 tasks that evaluates compositional generalization of ML interatomic potentials (MLIPs): whether models learn transferable chemistry or merely interpolate training patterns. Relevant to molecular dynamics-based drug design.
Composite
39.8
Experimental validation
Retrospective
Stages
Hit ID, Lead ID / ADMET
Modalities
small molecule
Task types
regression
Size
tasks: 4
molecules: unknown — compositional split evaluation
License
Other
First release
2026-05-09
Last updated
2026-05-09
Official site
Leaderboard
→ leaderboard
Dataset
→ dataset
Code / GitHub
→ repository
HuggingFace
→ HF
Paper
Benchmarking Compositional Generalisation for Machine Learning Interatomic Potentials · Amir Masoud Nourollah, Irtaza Khalid, Stefano Leoni, Steven Schockaert · 2026 · paper · doi:N/A — preprint · 0 citations
Flags
none
Experts
—
Groups
—
Hosted by
—
Related benchmarks
Rubric (7-criterion)
rigor
4
coverage
2
maintenance
3
adoption
1
quality
3
accessibility
2
industry_relevance
2
Notes
Addresses an important gap in MLIP evaluation: out-of-distribution (OOD) generalization to unseen molecular compositions. Shows that current models struggle (10x error on OOD). Closer to computational chemistry than to direct drug discovery, but relevant to the free energy calculations and MD simulations used in drug design. Narrow scope (4 tasks only).
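For readers unfamiliar with the term, the sketch below illustrates one possible form of a compositional split and the OOD-to-in-distribution error ratio mentioned in the notes. It is an illustration under stated assumptions, not the benchmark's actual protocol: the helper names (`elements`, `compositional_split`, `ood_error_ratio`), the toy formulas, and the rule "hold out every molecule that contains a chosen combination of elements" are all hypothetical.

```python
import re

def elements(formula):
    """Return the set of element symbols in a formula such as 'C2H6O'."""
    return frozenset(re.findall(r"[A-Z][a-z]?", formula))

def compositional_split(formulas, held_out):
    """Assumed split rule: a molecule goes to the OOD test set only if it
    contains ALL held-out elements together. Each element may still appear
    on its own in training; only the combination is unseen."""
    held_out = frozenset(held_out)
    train = [f for f in formulas if not held_out <= elements(f)]
    test = [f for f in formulas if held_out <= elements(f)]
    return train, test

def ood_error_ratio(id_error, ood_error):
    """Ratio of OOD error to in-distribution error (e.g. energy MAE)."""
    return ood_error / id_error

# Toy data, invented for illustration only.
formulas = ["CH4", "C2H6O", "NH3", "CH3NO2", "H2O", "C2H7N"]
train, test = compositional_split(formulas, held_out={"N", "O"})
print("train:", train)
print("test: ", test)
```

In this toy split, nitrogen and oxygen each appear in training molecules, but no training molecule contains both, so a model can only do well on the test set by combining what it learned about each element separately.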