CA-DEL
Open multi-target, multi-modal benchmark for learning from noisy DNA-encoded library (DEL) screens against three carbonic anhydrase isoforms, paired with real Ki validation data from ChEMBL to test sim-to-real selectivity prediction.
Composite
50.0
Experimental validation
Retrospective
Stages
Hit ID
Modalities
small molecule
Task types
classification, regression
Size
targets: 3
splits: {'train': 0, 'val': 0, 'test': 0}
note: DEL read counts across three CA isoforms plus ChEMBL Ki validation set
License
Other — open (per paper)
First release
2026-05-08
Last updated
2026-05-08
Official site
Leaderboard
→ leaderboard
Dataset
→ dataset
Code / GitHub
→ repository
HuggingFace
→ HF
Paper
CA-DEL: An Open Multi-Target, Multi-Modal Benchmark for Learning from DNA-Encoded Library Screens · Mutian He, Hanqun Cao, Cheng Tan, Zijun Gao, Xiaojun Yao, Chunbin Gu, Pheng-Ann Heng · 2026 · paper · doi:10.48550/arXiv.2605.07439 · 0 citations
Flags
none
Experts
—
Groups
—
Hosted by
—
Related benchmarks
Rubric (7-criterion)
rigor
4
coverage
3
maintenance
3
adoption
1
quality
4
accessibility
3
industry_relevance
4
Notes
arXiv preprint (8 May 2026). Addresses an under-benchmarked modality (DEL screens) and grounds it in real measurements: ChEMBL Ki data across three carbonic anhydrase isoforms serve as the validation set, which makes the benchmark directly relevant to industrial hit-finding. New and not yet cited, hence the low adoption score; the selectivity focus and real-world validation lift industry relevance. Complements BELKA as a second open DEL benchmark.