CA-DEL

Open multi-target, multi-modal benchmark for learning from noisy DNA-encoded library (DEL) screens against three carbonic anhydrase isoforms, paired with real Ki validation data from ChEMBL to test sim-to-real selectivity prediction.

Composite
50.0
Experimental validation
Retrospective
Stages
Hit ID
Modalities
small molecule
Task types
classification, regression
Size
targets: 3
splits: {'train': 0, 'val': 0, 'test': 0}
note: DEL read counts across three CA isoforms plus ChEMBL Ki validation set
License
Other — open (per paper)
First release
2026-05-08
Last updated
2026-05-08
Official site
→ project page
Leaderboard
→ leaderboard
Dataset
→ dataset
Code / GitHub
→ repository
HuggingFace
→ HF
Paper
CA-DEL: An Open Multi-Target, Multi-Modal Benchmark for Learning from DNA-Encoded Library Screens · Mutian He, Hanqun Cao, Cheng Tan, Zijun Gao, Xiaojun Yao, Chunbin Gu, Pheng-Ann Heng · 2026 · paper · doi:10.48550/arXiv.2605.07439 · 0 citations
Flags
none
Related benchmarks
BELKA (Big Encoded Library for Chemical Assessment), LIT-PCBA, PubChem BioAssay

Rubric (7-criterion)

rigor
4
coverage
3
maintenance
3
adoption
1
quality
4
accessibility
3
industry_relevance
4

Notes

arXiv preprint (8 May 2026). Addresses an under-benchmarked modality (DEL screens) and grounds it in real data: predictions are validated against ChEMBL Ki measurements across three carbonic anhydrase isoforms, which is directly relevant to industrial hit-finding. The benchmark is new and not yet cited; its selectivity focus and real-world validation lift industry relevance. Complements BELKA as a second open DEL benchmark.
