Spaces:
Running
Running
| <html lang="en"> | |
| <head> | |
| <meta charset="UTF-8"> | |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> | |
| <title>Bio Text Retrieval — Resources</title> | |
| <style> | |
| @import url('https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&family=JetBrains+Mono:wght@400;500&display=swap'); | |
| :root { | |
| --bg: #0a0a0b; | |
| --surface: #111113; | |
| --surface-2: #18181b; | |
| --border: #27272a; | |
| --border-hover: #3f3f46; | |
| --text: #fafafa; | |
| --text-2: #a1a1aa; | |
| --text-3: #71717a; | |
| --accent: #6ee7b7; | |
| --accent-dim: rgba(110, 231, 183, 0.1); | |
| --accent-2: #67e8f9; | |
| --accent-3: #c4b5fd; | |
| --accent-4: #fda4af; | |
| --accent-5: #fcd34d; | |
| --radius: 12px; | |
| --radius-sm: 8px; | |
| } | |
| * { margin: 0; padding: 0; box-sizing: border-box; } | |
| html { scroll-behavior: smooth; } | |
| body { | |
| font-family: 'Inter', -apple-system, BlinkMacSystemFont, sans-serif; | |
| background: var(--bg); | |
| color: var(--text); | |
| line-height: 1.6; | |
| -webkit-font-smoothing: antialiased; | |
| } | |
| /* ── NAV ─────────────────────────── */ | |
| nav { | |
| position: fixed; top: 0; left: 0; right: 0; z-index: 100; | |
| background: rgba(10, 10, 11, 0.8); | |
| backdrop-filter: blur(20px); | |
| border-bottom: 1px solid var(--border); | |
| padding: 0 2rem; | |
| } | |
| nav .inner { | |
| max-width: 1200px; margin: 0 auto; | |
| display: flex; align-items: center; justify-content: space-between; | |
| height: 56px; | |
| } | |
| nav .logo { | |
| font-weight: 600; font-size: 0.95rem; letter-spacing: -0.02em; | |
| display: flex; align-items: center; gap: 8px; | |
| } | |
| nav .logo span { color: var(--accent); } | |
| nav .links { display: flex; gap: 6px; } | |
| nav .links a { | |
| color: var(--text-2); text-decoration: none; font-size: 0.82rem; | |
| font-weight: 450; padding: 6px 12px; border-radius: 6px; | |
| transition: all 0.15s; | |
| } | |
| nav .links a:hover { color: var(--text); background: var(--surface-2); } | |
| /* ── HERO ─────────────────────────── */ | |
| .hero { | |
| padding: 140px 2rem 80px; | |
| text-align: center; | |
| position: relative; | |
| overflow: hidden; | |
| } | |
| .hero::before { | |
| content: ''; | |
| position: absolute; top: 60px; left: 50%; transform: translateX(-50%); | |
| width: 600px; height: 400px; | |
| background: radial-gradient(ellipse, rgba(110,231,183,0.06) 0%, transparent 70%); | |
| pointer-events: none; | |
| } | |
| .hero h1 { | |
| font-size: clamp(2.4rem, 5vw, 3.8rem); | |
| font-weight: 700; | |
| letter-spacing: -0.04em; | |
| line-height: 1.1; | |
| margin-bottom: 1rem; | |
| } | |
| .hero h1 em { | |
| font-style: normal; | |
| background: linear-gradient(135deg, var(--accent), var(--accent-2)); | |
| -webkit-background-clip: text; -webkit-text-fill-color: transparent; | |
| } | |
| .hero p { | |
| color: var(--text-2); font-size: 1.1rem; max-width: 560px; | |
| margin: 0 auto 2rem; font-weight: 350; | |
| } | |
| .hero .tag-row { | |
| display: flex; gap: 8px; justify-content: center; flex-wrap: wrap; | |
| } | |
| .tag { | |
| font-size: 0.75rem; font-weight: 500; padding: 5px 12px; | |
| border-radius: 100px; border: 1px solid var(--border); | |
| color: var(--text-2); background: var(--surface); | |
| } | |
| /* ── SECTION ──────────────────────── */ | |
| section { | |
| max-width: 1200px; margin: 0 auto; | |
| padding: 60px 2rem 0; | |
| } | |
| .section-head { | |
| margin-bottom: 2rem; | |
| } | |
| .section-head h2 { | |
| font-size: 1.5rem; font-weight: 650; letter-spacing: -0.03em; | |
| display: flex; align-items: center; gap: 10px; | |
| } | |
| .section-head h2 .icon { | |
| width: 32px; height: 32px; border-radius: var(--radius-sm); | |
| display: grid; place-items: center; font-size: 1rem; | |
| } | |
| .section-head p { | |
| color: var(--text-3); font-size: 0.9rem; margin-top: 6px; | |
| max-width: 600px; | |
| } | |
| /* ── TIMELINE ─────────────────────── */ | |
| .timeline { | |
| position: relative; | |
| padding-left: 32px; | |
| } | |
| .timeline::before { | |
| content: ''; | |
| position: absolute; left: 7px; top: 0; bottom: 0; width: 2px; | |
| background: linear-gradient(to bottom, var(--accent), var(--border) 80%, transparent); | |
| } | |
| .timeline-item { | |
| position: relative; | |
| margin-bottom: 2.5rem; | |
| } | |
| .timeline-item::before { | |
| content: ''; | |
| position: absolute; left: -29px; top: 8px; | |
| width: 10px; height: 10px; border-radius: 50%; | |
| background: var(--accent); | |
| box-shadow: 0 0 8px rgba(110,231,183,0.4); | |
| } | |
| .timeline-item .year { | |
| font-family: 'JetBrains Mono', monospace; | |
| font-size: 0.75rem; color: var(--accent); font-weight: 500; | |
| margin-bottom: 4px; | |
| } | |
| .timeline-item h3 { | |
| font-size: 1.05rem; font-weight: 600; letter-spacing: -0.02em; | |
| margin-bottom: 4px; | |
| } | |
| .timeline-item h3 a { | |
| color: var(--text); text-decoration: none; | |
| transition: color 0.15s; | |
| } | |
| .timeline-item h3 a:hover { color: var(--accent); } | |
| .timeline-item .desc { | |
| color: var(--text-2); font-size: 0.88rem; line-height: 1.55; | |
| } | |
| .timeline-item .meta { | |
| display: flex; gap: 8px; margin-top: 8px; flex-wrap: wrap; | |
| } | |
| .timeline-item .meta a { | |
| font-size: 0.72rem; padding: 3px 10px; border-radius: 100px; | |
| text-decoration: none; font-weight: 500; | |
| border: 1px solid var(--border); color: var(--text-3); | |
| transition: all 0.15s; | |
| } | |
| .timeline-item .meta a:hover { | |
| border-color: var(--accent); color: var(--accent); | |
| } | |
| /* ── CARD GRID ────────────────────── */ | |
| .grid { | |
| display: grid; | |
| grid-template-columns: repeat(auto-fill, minmax(320px, 1fr)); | |
| gap: 16px; | |
| } | |
| .card { | |
| background: var(--surface); | |
| border: 1px solid var(--border); | |
| border-radius: var(--radius); | |
| padding: 20px; | |
| transition: border-color 0.2s, transform 0.2s; | |
| text-decoration: none; color: inherit; | |
| display: flex; flex-direction: column; | |
| } | |
| .card:hover { | |
| border-color: var(--border-hover); | |
| transform: translateY(-2px); | |
| } | |
| .card .card-top { | |
| display: flex; justify-content: space-between; align-items: flex-start; | |
| margin-bottom: 10px; | |
| } | |
| .card h3 { | |
| font-size: 0.95rem; font-weight: 600; letter-spacing: -0.01em; | |
| } | |
| .card .badge { | |
| font-size: 0.68rem; font-weight: 600; padding: 3px 9px; | |
| border-radius: 100px; white-space: nowrap; flex-shrink: 0; | |
| } | |
| .badge-model { background: rgba(110,231,183,0.12); color: var(--accent); } | |
| .badge-dataset { background: rgba(103,232,249,0.12); color: var(--accent-2); } | |
| .badge-bench { background: rgba(196,181,253,0.12); color: var(--accent-3); } | |
| .badge-paper { background: rgba(253,164,175,0.12); color: var(--accent-4); } | |
| .badge-training { background: rgba(252,211,77,0.12); color: var(--accent-5); } | |
| .card .desc { | |
| color: var(--text-2); font-size: 0.84rem; flex: 1; | |
| line-height: 1.5; | |
| } | |
| .card .card-footer { | |
| margin-top: 12px; display: flex; gap: 6px; flex-wrap: wrap; | |
| } | |
| .card .pill { | |
| font-family: 'JetBrains Mono', monospace; | |
| font-size: 0.68rem; padding: 3px 8px; border-radius: 4px; | |
| background: var(--surface-2); color: var(--text-3); | |
| } | |
| /* ── BENCHMARK TABLE ─────────────── */ | |
| .table-wrap { | |
| overflow-x: auto; | |
| border: 1px solid var(--border); | |
| border-radius: var(--radius); | |
| background: var(--surface); | |
| } | |
| table { | |
| width: 100%; border-collapse: collapse; | |
| font-size: 0.84rem; | |
| } | |
| thead th { | |
| text-align: left; padding: 12px 16px; | |
| font-weight: 600; font-size: 0.78rem; color: var(--text-3); | |
| text-transform: uppercase; letter-spacing: 0.05em; | |
| border-bottom: 1px solid var(--border); | |
| position: sticky; top: 0; background: var(--surface); | |
| } | |
| tbody td { | |
| padding: 11px 16px; border-bottom: 1px solid var(--border); | |
| color: var(--text-2); | |
| } | |
| tbody tr:last-child td { border-bottom: none; } | |
| tbody tr:hover { background: var(--surface-2); } | |
| tbody td:first-child { font-weight: 500; color: var(--text); } | |
| td a { | |
| color: var(--accent); text-decoration: none; | |
| } | |
| td a:hover { text-decoration: underline; } | |
| /* ── LEADERBOARD TABLE ───────────── */ | |
| .lb-rank { | |
| font-family: 'JetBrains Mono', monospace; | |
| font-weight: 600; color: var(--accent); font-size: 0.85rem; | |
| } | |
| .lb-score { | |
| font-family: 'JetBrains Mono', monospace; | |
| font-weight: 500; color: var(--accent-5); | |
| } | |
| /* ── RECIPE CARDS ────────────────── */ | |
| .recipe-grid { | |
| display: grid; | |
| grid-template-columns: repeat(auto-fill, minmax(360px, 1fr)); | |
| gap: 16px; | |
| } | |
| .recipe-card { | |
| background: var(--surface); | |
| border: 1px solid var(--border); | |
| border-radius: var(--radius); | |
| padding: 20px; position: relative; | |
| overflow: hidden; | |
| } | |
| .recipe-card::before { | |
| content: ''; | |
| position: absolute; top: 0; left: 0; right: 0; height: 3px; | |
| } | |
| .recipe-card:nth-child(1)::before { background: var(--accent); } | |
| .recipe-card:nth-child(2)::before { background: var(--accent-2); } | |
| .recipe-card:nth-child(3)::before { background: var(--accent-3); } | |
| .recipe-card .rank { | |
| font-family: 'JetBrains Mono', monospace; | |
| font-size: 0.72rem; color: var(--text-3); margin-bottom: 6px; | |
| } | |
| .recipe-card h3 { font-size: 1rem; font-weight: 600; margin-bottom: 8px; } | |
| .recipe-card .recipe-desc { | |
| color: var(--text-2); font-size: 0.84rem; line-height: 1.5; | |
| } | |
| .recipe-card .recipe-result { | |
| margin-top: 12px; padding: 10px 14px; | |
| background: var(--surface-2); border-radius: var(--radius-sm); | |
| font-family: 'JetBrains Mono', monospace; | |
| font-size: 0.78rem; color: var(--accent); | |
| } | |
| /* ── PATH SECTION ────────────────── */ | |
| .path-list { | |
| counter-reset: step; | |
| } | |
| .path-step { | |
| display: flex; gap: 16px; margin-bottom: 1.5rem; | |
| align-items: flex-start; | |
| } | |
| .path-step .num { | |
| counter-increment: step; | |
| width: 36px; height: 36px; border-radius: 50%; | |
| background: var(--accent-dim); | |
| border: 1px solid rgba(110,231,183,0.2); | |
| display: grid; place-items: center; | |
| font-family: 'JetBrains Mono', monospace; | |
| font-size: 0.82rem; font-weight: 600; | |
| color: var(--accent); flex-shrink: 0; | |
| } | |
| .path-step .content h3 { | |
| font-size: 0.95rem; font-weight: 600; | |
| margin-bottom: 4px; | |
| } | |
| .path-step .content p { | |
| color: var(--text-2); font-size: 0.84rem; | |
| } | |
| .path-step .content a { | |
| color: var(--accent); text-decoration: none; | |
| } | |
| .path-step .content a:hover { text-decoration: underline; } | |
| /* ── FOOTER ───────────────────────── */ | |
| footer { | |
| max-width: 1200px; margin: 80px auto 0; | |
| padding: 30px 2rem; | |
| border-top: 1px solid var(--border); | |
| display: flex; justify-content: space-between; align-items: center; | |
| flex-wrap: wrap; gap: 12px; | |
| } | |
| footer p { | |
| color: var(--text-3); font-size: 0.78rem; | |
| } | |
| footer a { color: var(--accent); text-decoration: none; } | |
| footer a:hover { text-decoration: underline; } | |
| /* ── RESPONSIVE ───────────────────── */ | |
| @media (max-width: 640px) { | |
| nav .links { display: none; } | |
| .grid, .recipe-grid { | |
| grid-template-columns: 1fr; | |
| } | |
| .hero { padding: 120px 1.5rem 50px; } | |
| section { padding: 40px 1.5rem 0; } | |
| } | |
| </style> | |
| </head> | |
| <body> | |
| <!-- NAV --> | |
| <nav> | |
| <div class="inner"> | |
| <div class="logo">🧬 <span>BioRetrieval</span></div> | |
| <div class="links"> | |
| <a href="#evolution">Evolution</a> | |
| <a href="#models">Models</a> | |
| <a href="#benchmarks">Benchmarks</a> | |
| <a href="#datasets">Datasets</a> | |
| <a href="#leaderboard">Leaderboard</a> | |
| <a href="#start">Get Started</a> | |
| </div> | |
| </div> | |
| </nav> | |
| <!-- HERO --> | |
| <div class="hero"> | |
| <h1>Biomedical <em>Text Retrieval</em></h1> | |
| <p>A curated map of papers, models, datasets, and benchmarks for dense retrieval in the biomedical domain.</p> | |
| <div class="tag-row"> | |
| <span class="tag">15 papers</span> | |
| <span class="tag">10+ models</span> | |
| <span class="tag">12 benchmarks</span> | |
| <span class="tag">7 training datasets</span> | |
| </div> | |
| </div> | |
| <!-- EVOLUTION TIMELINE --> | |
| <section id="evolution"> | |
| <div class="section-head"> | |
| <h2><span class="icon">📜</span> Evolution of BioRetrieval</h2> | |
| <p>From domain pretraining to LLM-based retrievers — key milestones that shaped the field.</p> | |
| </div> | |
| <div class="timeline"> | |
| <div class="timeline-item"> | |
| <div class="year">2019</div> | |
| <h3><a href="https://arxiv.org/abs/1901.08746">BioBERT</a></h3> | |
| <div class="desc">First BERT fine-tuned on PubMed abstracts + PMC full texts. Proved domain pretraining consistently improves biomedical NER, RE, and QA. The baseline everything is measured against.</div> | |
| <div class="meta"> | |
| <a href="https://hf.co/dmis-lab/biobert-v1.1">🤗 dmis-lab/biobert-v1.1</a> | |
| <a href="https://arxiv.org/abs/1901.08746">arXiv</a> | |
| </div> | |
| </div> | |
| <div class="timeline-item"> | |
| <div class="year">2019</div> | |
| <h3><a href="https://arxiv.org/abs/1903.10676">SciBERT</a></h3> | |
| <div class="desc">BERT pretrained on 1.14M scientific papers (18% biomed, 82% CS). Broader scientific vocabulary makes it competitive on retrieval and used as backbone for SLEDGE-Z (TREC-COVID SOTA).</div> | |
| <div class="meta"> | |
| <a href="https://hf.co/allenai/scibert_scivocab_uncased">🤗 allenai/scibert</a> | |
| <a href="https://arxiv.org/abs/1903.10676">arXiv</a> | |
| </div> | |
| </div> | |
| <div class="timeline-item"> | |
| <div class="year">2020</div> | |
| <h3><a href="https://arxiv.org/abs/2007.15779">PubMedBERT</a></h3> | |
| <div class="desc">Showed pretraining from scratch on PubMed beats continual pretraining from general BERT. Introduced the BLURB benchmark. Became the de-facto backbone for biomedical retrieval fine-tuning.</div> | |
| <div class="meta"> | |
| <a href="https://hf.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract">🤗 microsoft/PubMedBERT</a> | |
| <a href="https://arxiv.org/abs/2007.15779">arXiv</a> | |
| </div> | |
| </div> | |
| <div class="timeline-item"> | |
| <div class="year">2021</div> | |
| <h3><a href="https://arxiv.org/abs/2010.11784">SapBERT</a></h3> | |
| <div class="desc">Self-alignment pretraining using UMLS ontology + metric learning. SOTA on medical entity linking without task-specific supervision — the go-to for entity disambiguation and concept normalization.</div> | |
| <div class="meta"> | |
| <a href="https://hf.co/cambridgeltl/SapBERT-from-PubMedBERT-fulltext-mean-token">🤗 cambridgeltl/SapBERT</a> | |
| <a href="https://arxiv.org/abs/2010.11784">arXiv</a> | |
| </div> | |
| </div> | |
| <div class="timeline-item"> | |
| <div class="year">2021</div> | |
| <h3><a href="https://arxiv.org/abs/2104.08663">BEIR Benchmark</a></h3> | |
| <div class="desc">18 datasets, 9 tasks — revealed that dense models trained on MS MARCO generalize poorly to biomedical domains. BM25 often wins out-of-distribution. The standard evaluation framework.</div> | |
| <div class="meta"> | |
| <a href="https://github.com/beir-cellar/beir">GitHub</a> | |
| <a href="https://arxiv.org/abs/2104.08663">arXiv · NeurIPS '21</a> | |
| </div> | |
| </div> | |
| <div class="timeline-item"> | |
| <div class="year">2022</div> | |
| <h3><a href="https://arxiv.org/abs/2203.15827">BioLinkBERT</a></h3> | |
| <div class="desc">Pretrains via hyperlinked documents in the same context window + Document Relation Prediction. Excels at multi-hop biomedical reasoning (BioASQ, USMLE QA).</div> | |
| <div class="meta"> | |
| <a href="https://hf.co/michiyasunaga/BioLinkBERT-large">🤗 michiyasunaga/BioLinkBERT-large</a> | |
| <a href="https://arxiv.org/abs/2203.15827">arXiv · ACL '22</a> | |
| </div> | |
| </div> | |
| <div class="timeline-item"> | |
| <div class="year">2023</div> | |
| <h3><a href="https://arxiv.org/abs/2307.00589">MedCPT / BioCPT</a></h3> | |
| <div class="desc">Trained on 255M PubMed user click logs via contrastive learning — zero-shot SOTA on 5 biomedical IR tasks. Released as query/article encoder pair + cross-encoder reranker. Click logs as free supervision.</div> | |
| <div class="meta"> | |
| <a href="https://hf.co/ncbi/MedCPT-Query-Encoder">🤗 Query Encoder (382K↓)</a> | |
| <a href="https://hf.co/ncbi/MedCPT-Article-Encoder">🤗 Article Encoder</a> | |
| <a href="https://hf.co/ncbi/MedCPT-Cross-Encoder">🤗 Cross-Encoder</a> | |
| <a href="https://arxiv.org/abs/2307.00589">arXiv</a> | |
| </div> | |
| </div> | |
| <div class="timeline-item"> | |
| <div class="year">2023</div> | |
| <h3><a href="https://arxiv.org/abs/2311.16075">BioLORD-2023</a></h3> | |
| <div class="desc">Grounds biomedical concepts in UMLS definitions via multi-phase contrastive learning + LLM self-distillation + weight averaging. SOTA on MedSTS, MedNLI-S, EHR-Rel-B. Multilingual variants available.</div> | |
| <div class="meta"> | |
| <a href="https://hf.co/FremyCompany/BioLORD-2023">🤗 FremyCompany/BioLORD-2023 (147K↓)</a> | |
| <a href="https://arxiv.org/abs/2311.16075">arXiv · EMNLP '23</a> | |
| </div> | |
| </div> | |
| <div class="timeline-item"> | |
| <div class="year">2024</div> | |
| <h3><a href="https://arxiv.org/abs/2404.18443">BMRetriever</a></h3> | |
| <div class="desc">LLM-based retriever: unsupervised contrastive pretraining on PubMed/textbooks/StatPearls, then instruction fine-tuning on 11 datasets. 410M model outperforms baselines 11.7× larger. 2B matches 5B+ models.</div> | |
| <div class="meta"> | |
| <a href="https://hf.co/BMRetriever/BMRetriever-410M">🤗 410M</a> | |
| <a href="https://hf.co/BMRetriever/BMRetriever-2B">🤗 2B</a> | |
| <a href="https://hf.co/BMRetriever/BMRetriever-7B">🤗 7B</a> | |
| <a href="https://arxiv.org/abs/2404.18443">arXiv · EMNLP '24</a> | |
| </div> | |
| </div> | |
| <div class="timeline-item"> | |
| <div class="year">2024</div> | |
| <h3><a href="https://arxiv.org/abs/2511.08029">BiCA</a></h3> | |
| <div class="desc">Citation-aware hard negatives: 2-hop citation graphs from PubMed articles for semantic hard-negative mining. Fine-tunes GTE-small/base with only 20K examples — consistent BEIR + LoTTE gains.</div> | |
| <div class="meta"> | |
| <a href="https://github.com/NiravBhattLab/BiCA">GitHub</a> | |
| <a href="https://arxiv.org/abs/2511.08029">arXiv</a> | |
| </div> | |
| </div> | |
| <div class="timeline-item"> | |
| <div class="year">2025</div> | |
| <h3><a href="https://arxiv.org/abs/2507.19407">MedTE + MedTEB</a></h3> | |
| <div class="desc">51-task medical embedding benchmark (classification, clustering, retrieval). MedTE model (GTE-Base fine-tuned on 7 medical corpora) achieves mean 0.578 vs 0.539 next-best. The new comprehensive eval standard.</div> | |
| <div class="meta"> | |
| <a href="https://hf.co/MohammadKhodadad/MedTE">🤗 MohammadKhodadad/MedTE</a> | |
| <a href="https://github.com/MohammadKhodadad/MedTEB">GitHub</a> | |
| <a href="https://arxiv.org/abs/2507.19407">arXiv</a> | |
| </div> | |
| </div> | |
| <div class="timeline-item"> | |
| <div class="year">2025</div> | |
| <h3><a href="https://arxiv.org/abs/2604.15591">BioHiCL</a></h3> | |
| <div class="desc">Hierarchical MeSH supervision: depth-weighted contrastive loss + LoRA on BGE models. 0.1B model achieves IR Avg 0.543, beating BMRetriever-1B. Best on NFCorpus and SCIDOCS. Current efficiency SOTA.</div> | |
| <div class="meta"> | |
| <a href="https://hf.co/LunaLan07/BioHiCL-Base">🤗 BioHiCL-Base</a> | |
| <a href="https://hf.co/LunaLan07/BioHiCL-Large">🤗 BioHiCL-Large</a> | |
| <a href="https://arxiv.org/abs/2604.15591">arXiv</a> | |
| </div> | |
| </div> | |
| </div> | |
| </section> | |
| <!-- MODELS --> | |
| <section id="models"> | |
| <div class="section-head"> | |
| <h2><span class="icon">🧠</span> Models</h2> | |
| <p>Production-ready retrieval models available on the Hugging Face Hub.</p> | |
| </div> | |
| <div class="grid"> | |
| <a class="card" href="https://hf.co/ncbi/MedCPT-Query-Encoder" target="_blank"> | |
| <div class="card-top"> | |
| <h3>MedCPT (Query + Article)</h3> | |
| <span class="badge badge-model">Model</span> | |
| </div> | |
| <div class="desc">Asymmetric bi-encoder from NCBI, trained on 255M PubMed click logs. Separate query/article encoders + cross-encoder reranker. Zero-shot SOTA on biomedical IR.</div> | |
| <div class="card-footer"> | |
| <span class="pill">382K ↓</span> | |
| <span class="pill">BERT-base</span> | |
| <span class="pill">ncbi</span> | |
| </div> | |
| </a> | |
| <a class="card" href="https://hf.co/FremyCompany/BioLORD-2023" target="_blank"> | |
| <div class="card-top"> | |
| <h3>BioLORD-2023</h3> | |
| <span class="badge badge-model">Model</span> | |
| </div> | |
| <div class="desc">UMLS-grounded sentence embeddings via multi-phase contrastive + LLM distillation + weight averaging. SOTA on clinical STS, entity linking, and concept similarity. Multilingual variants available.</div> | |
| <div class="card-footer"> | |
| <span class="pill">147K ↓</span> | |
| <span class="pill">MPNet</span> | |
| <span class="pill">sentence-transformers</span> | |
| </div> | |
| </a> | |
| <a class="card" href="https://hf.co/BMRetriever/BMRetriever-410M" target="_blank"> | |
| <div class="card-top"> | |
| <h3>BMRetriever (410M–7B)</h3> | |
| <span class="badge badge-model">Model</span> | |
| </div> | |
| <div class="desc">LLM-based retriever family. Instruction-formatted queries + last-token pooling. 410M outperforms 5B+ baselines. Models at 410M (GPT-NeoX), 2B (Gemma), 7B (Mistral).</div> | |
| <div class="card-footer"> | |
| <span class="pill">410M–7B</span> | |
| <span class="pill">MIT</span> | |
| <span class="pill">EMNLP '24</span> | |
| </div> | |
| </a> | |
| <a class="card" href="https://hf.co/LunaLan07/BioHiCL-Base" target="_blank"> | |
| <div class="card-top"> | |
| <h3>BioHiCL</h3> | |
| <span class="badge badge-model">Model</span> | |
| </div> | |
| <div class="desc">MeSH hierarchy-supervised BGE model. Depth-weighted contrastive loss + LoRA. 0.1B params achieves IR Avg 0.543, beating BMRetriever-1B. Best efficiency/performance ratio.</div> | |
| <div class="card-footer"> | |
| <span class="pill">110M</span> | |
| <span class="pill">BERT</span> | |
| <span class="pill">2025</span> | |
| </div> | |
| </a> | |
| <a class="card" href="https://hf.co/cambridgeltl/SapBERT-from-PubMedBERT-fulltext-mean-token" target="_blank"> | |
| <div class="card-top"> | |
| <h3>SapBERT</h3> | |
| <span class="badge badge-model">Model</span> | |
| </div> | |
| <div class="desc">Self-alignment on UMLS synonyms via metric learning. Go-to model for medical entity linking and concept normalization. No task-specific labels needed.</div> | |
| <div class="card-footer"> | |
| <span class="pill">457K ↓</span> | |
| <span class="pill">PubMedBERT</span> | |
| <span class="pill">NAACL '21</span> | |
| </div> | |
| </a> | |
| <a class="card" href="https://hf.co/MohammadKhodadad/MedTE" target="_blank"> | |
| <div class="card-top"> | |
| <h3>MedTE</h3> | |
| <span class="badge badge-model">Model</span> | |
| </div> | |
| <div class="desc">GTE-Base fine-tuned on 7 diverse medical corpora (PubMed, MIMIC-IV, ClinicalTrials, bioRxiv/medRxiv). Mean 0.578 on MedTEB — best medical general-purpose embedding model.</div> | |
| <div class="card-footer"> | |
| <span class="pill">~110M</span> | |
| <span class="pill">GTE-Base</span> | |
| <span class="pill">2025</span> | |
| </div> | |
| </a> | |
| <a class="card" href="https://hf.co/michiyasunaga/BioLinkBERT-large" target="_blank"> | |
| <div class="card-top"> | |
| <h3>BioLinkBERT</h3> | |
| <span class="badge badge-model">Model</span> | |
| </div> | |
| <div class="desc">Document-link pretraining on PubMed hyperlinks. Excels at multi-hop biomedical reasoning — top performer on BioASQ QA and USMLE-style questions.</div> | |
| <div class="card-footer"> | |
| <span class="pill">7.3K ↓</span> | |
| <span class="pill">BERT-large</span> | |
| <span class="pill">ACL '22</span> | |
| </div> | |
| </a> | |
| <a class="card" href="https://hf.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract" target="_blank"> | |
| <div class="card-top"> | |
| <h3>PubMedBERT</h3> | |
| <span class="badge badge-model">Model</span> | |
| </div> | |
| <div class="desc">The backbone model for biomedical fine-tuning. Pretrained from scratch on PubMed — not continued from general BERT. Foundation for MedCPT, SapBERT, and many others.</div> | |
| <div class="card-footer"> | |
| <span class="pill">110M</span> | |
| <span class="pill">microsoft</span> | |
| <span class="pill">BLURB</span> | |
| </div> | |
| </a> | |
| </div> | |
| </section> | |
| <!-- BENCHMARKS --> | |
| <section id="benchmarks"> | |
| <div class="section-head"> | |
| <h2><span class="icon">📊</span> Benchmarks</h2> | |
| <p>Standard evaluation suites for biomedical retrieval — use these to measure your models.</p> | |
| </div> | |
| <div class="table-wrap"> | |
| <table> | |
| <thead> | |
| <tr> | |
| <th>Benchmark</th> | |
| <th>Task</th> | |
| <th>Domain</th> | |
| <th>Scale</th> | |
| <th>Metric</th> | |
| <th>Link</th> | |
| </tr> | |
| </thead> | |
| <tbody> | |
| <tr> | |
| <td>NFCorpus</td> | |
| <td>Ad-hoc search</td> | |
| <td>Nutrition / Medicine</td> | |
| <td>323 queries · 3.6K docs</td> | |
| <td>nDCG@10</td> | |
| <td><a href="https://hf.co/datasets/BeIR/nfcorpus">🤗 BeIR/nfcorpus</a></td> | |
| </tr> | |
| <tr> | |
| <td>TREC-COVID</td> | |
| <td>Ad-hoc retrieval</td> | |
| <td>COVID-19 / CORD-19</td> | |
| <td>50 queries · 171K docs</td> | |
| <td>nDCG@10</td> | |
| <td><a href="https://hf.co/datasets/BeIR/trec-covid">🤗 BeIR/trec-covid</a></td> | |
| </tr> | |
| <tr> | |
| <td>SciFact</td> | |
| <td>Claim verification</td> | |
| <td>Scientific claims</td> | |
| <td>~300 queries · 5K abstracts</td> | |
| <td>nDCG@10</td> | |
| <td><a href="https://hf.co/datasets/BeIR/scifact">🤗 BeIR/scifact</a></td> | |
| </tr> | |
| <tr> | |
| <td>BioASQ</td> | |
| <td>QA retrieval</td> | |
| <td>Biomedical QA</td> | |
| <td>Varies annually</td> | |
| <td>MAP, nDCG</td> | |
| <td><a href="http://participants-area.bioasq.org/">bioasq.org</a></td> | |
| </tr> | |
| <tr> | |
| <td>SCIDOCS</td> | |
| <td>Document similarity</td> | |
| <td>Scientific papers</td> | |
| <td>1K queries · 25K docs</td> | |
| <td>nDCG@10</td> | |
| <td><a href="https://hf.co/datasets/BeIR/scidocs">🤗 BeIR/scidocs</a></td> | |
| </tr> | |
| <tr> | |
| <td>BIOSSES</td> | |
| <td>Sentence similarity</td> | |
| <td>Biomedical</td> | |
| <td>100 sentence pairs</td> | |
| <td>Pearson r</td> | |
| <td><a href="https://hf.co/datasets/tabilab/biosses">🤗 tabilab/biosses</a></td> | |
| </tr> | |
| <tr> | |
| <td>PubMedQA</td> | |
| <td>QA retrieval</td> | |
| <td>PubMed abstracts</td> | |
| <td>1K labeled</td> | |
| <td>Accuracy</td> | |
| <td><a href="https://hf.co/datasets/qiaojin/PubMedQA">🤗 PubMedQA</a></td> | |
| </tr> | |
| <tr> | |
| <td>MedTEB</td> | |
| <td>51 medical tasks</td> | |
| <td>Pan-medical</td> | |
| <td>Comprehensive</td> | |
| <td>Multi-metric</td> | |
| <td><a href="https://github.com/MohammadKhodadad/MedTEB">GitHub</a></td> | |
| </tr> | |
| <tr> | |
| <td>R2MED</td> | |
| <td>Reasoning retrieval</td> | |
| <td>Clinical decision</td> | |
| <td>Multi-type</td> | |
| <td>nDCG@10</td> | |
| <td><a href="https://arxiv.org/abs/2505.14558">arXiv</a></td> | |
| </tr> | |
| </tbody> | |
| </table> | |
| </div> | |
| </section> | |
| <!-- DATASETS --> | |
| <section id="datasets"> | |
| <div class="section-head"> | |
| <h2><span class="icon">🗂️</span> Training Datasets</h2> | |
| <p>Key corpora and labeled data for training biomedical retrieval models.</p> | |
| </div> | |
| <div class="grid"> | |
| <a class="card" href="https://hf.co/datasets/MedRAG/pubmed" target="_blank"> | |
| <div class="card-top"> | |
| <h3>MedRAG/pubmed</h3> | |
| <span class="badge badge-dataset">Dataset</span> | |
| </div> | |
| <div class="desc">PubMed abstracts corpus — the core pretraining data for biomedical models. Used by BMRetriever, MedTE, and most domain-adapted models.</div> | |
| <div class="card-footer"> | |
| <span class="pill">pretraining</span> | |
| <span class="pill">abstracts</span> | |
| </div> | |
| </a> | |
| <a class="card" href="https://hf.co/datasets/MedRAG/textbooks" target="_blank"> | |
| <div class="card-top"> | |
| <h3>MedRAG/textbooks</h3> | |
| <span class="badge badge-dataset">Dataset</span> | |
| </div> | |
| <div class="desc">Medical textbook passages — high-quality, structured biomedical knowledge. Core fine-tuning data for BMRetriever and RAG applications.</div> | |
| <div class="card-footer"> | |
| <span class="pill">fine-tuning</span> | |
| <span class="pill">textbooks</span> | |
| </div> | |
| </a> | |
| <a class="card" href="https://hf.co/datasets/MedRAG/statpearls" target="_blank"> | |
| <div class="card-top"> | |
| <h3>MedRAG/statpearls</h3> | |
| <span class="badge badge-dataset">Dataset</span> | |
| </div> | |
| <div class="desc">StatPearls clinical reference articles — continuously updated clinical content used for retriever fine-tuning and medical Q&A.</div> | |
| <div class="card-footer"> | |
| <span class="pill">clinical</span> | |
| <span class="pill">fine-tuning</span> | |
| </div> | |
| </a> | |
| <a class="card" href="https://hf.co/datasets/BMRetriever/biomed_retrieval_dataset" target="_blank"> | |
| <div class="card-top"> | |
| <h3>BMRetriever Training Mix</h3> | |
| <span class="badge badge-training">Training</span> | |
| </div> | |
| <div class="desc">11-task instruction mixture for biomedical retrieval fine-tuning — query-document pairs spanning medical QA, entity linking, and scientific claim verification.</div> | |
| <div class="card-footer"> | |
| <span class="pill">instruction</span> | |
| <span class="pill">11 tasks</span> | |
| </div> | |
| </a> | |
| <a class="card" href="https://hf.co/datasets/FremyCompany/BioLORD-Dataset" target="_blank"> | |
| <div class="card-top"> | |
| <h3>BioLORD Dataset</h3> | |
| <span class="badge badge-dataset">Dataset</span> | |
| </div> | |
| <div class="desc">UMLS concept definition pairs for contrastive learning. Powers BioLORD-2023's clinical concept embeddings and medical entity similarity.</div> | |
| <div class="card-footer"> | |
| <span class="pill">UMLS</span> | |
| <span class="pill">contrastive</span> | |
| </div> | |
| </a> | |
| <a class="card" href="https://hf.co/datasets/allenai/cord19" target="_blank"> | |
| <div class="card-top"> | |
| <h3>CORD-19</h3> | |
| <span class="badge badge-dataset">Dataset</span> | |
| </div> | |
| <div class="desc">COVID-19 Open Research Dataset — 400K+ research papers. The corpus behind TREC-COVID, used for pandemic-era retrieval research and benchmarking.</div> | |
| <div class="card-footer"> | |
| <span class="pill">400K+ papers</span> | |
| <span class="pill">COVID-19</span> | |
| </div> | |
| </a> | |
| </div> | |
| </section> | |
| <!-- LEADERBOARD --> | |
| <section id="leaderboard"> | |
| <div class="section-head"> | |
| <h2><span class="icon">🏆</span> Training Recipes Leaderboard</h2> | |
| <p>Ranked by result quality — the best published approaches for training a biomedical retriever.</p> | |
</div>
<div class="table-wrap">
<table>
<thead>
<tr>
<th>#</th>
<th>Model</th>
<th>Params</th>
<th>Training Recipe</th>
<th>Best Result</th>
<th>Paper</th>
</tr>
</thead>
<tbody>
<tr>
<td class="lb-rank">1</td>
<td>BioHiCL-Base</td>
<td>0.1B</td>
<td>BGE + MeSH hierarchy contrastive (depth-weighted) + LoRA</td>
<td class="lb-score">IR Avg 0.543, NFCorpus 0.379</td>
<td><a href="https://arxiv.org/abs/2604.15591">2604.15591</a></td>
</tr>
<tr>
<td class="lb-rank">2</td>
<td>BMRetriever-2B</td>
<td>2B</td>
<td>LLM + unsupervised contrastive on PubMed/textbooks + instruction FT</td>
<td class="lb-score">Matches 5B+ models across 11 tasks</td>
<td><a href="https://arxiv.org/abs/2404.18443">2404.18443</a></td>
</tr>
<tr>
<td class="lb-rank">3</td>
<td>MedTE</td>
<td>~0.1B</td>
<td>GTE-Base + self-supervised contrastive on 7 medical corpora</td>
<td class="lb-score">MedTEB mean 0.578</td>
<td><a href="https://arxiv.org/abs/2507.19407">2507.19407</a></td>
</tr>
<tr>
<td class="lb-rank">4</td>
<td>BiCA-Base</td>
<td>~0.1B</td>
<td>GTE-Base + 2-hop citation hard negatives, 20K examples</td>
<td class="lb-score">Consistent gains on BEIR + LoTTE</td>
<td><a href="https://arxiv.org/abs/2511.08029">2511.08029</a></td>
</tr>
<tr>
<td class="lb-rank">5</td>
<td>MedCPT</td>
<td>~0.1B</td>
<td>PubMedBERT + 255M click-log contrastive (retriever + reranker)</td>
<td class="lb-score">Zero-shot SOTA on 5 bio IR tasks</td>
<td><a href="https://arxiv.org/abs/2307.00589">2307.00589</a></td>
</tr>
<tr>
<td class="lb-rank">6</td>
<td>BioLORD-2023</td>
<td>~0.1B</td>
<td>PubMedBERT + UMLS definitions contrastive + LLM distillation + weight averaging</td>
<td class="lb-score">SOTA MedSTS, EHR-Rel-B</td>
<td><a href="https://arxiv.org/abs/2311.16075">2311.16075</a></td>
</tr>
</tbody>
</table>
</div>
</section>
<!-- GET STARTED -->
<section id="start">
<div class="section-head">
<h2><span class="icon">🚀</span> Get Started</h2>
<p>Recommended learning path for biomedical text retrieval.</p>
</div>
<div class="path-list">
<div class="path-step">
<div class="num">1</div>
<div class="content">
<h3>Understand the evaluation landscape</h3>
<p>Read the <a href="https://arxiv.org/abs/2104.08663">BEIR paper</a> to understand why domain generalization is hard. Run BM25 as your baseline on <a href="https://hf.co/datasets/BeIR/nfcorpus">NFCorpus</a> — it's surprisingly competitive and sets a meaningful floor.</p>
</div>
</div>
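<p>The BM25 formula behind that baseline is simple enough to write out by hand. A minimal pure-Python sketch of Okapi BM25 with the usual default parameters; tokenization and NFCorpus loading are left out, and the function name is ours:</p>

```python
import math
from collections import Counter

def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of each document against a tokenized query."""
    n_docs = len(corpus_tokens)
    avgdl = sum(len(d) for d in corpus_tokens) / n_docs
    # Document frequency of every term, for the IDF component.
    df = Counter()
    for doc in corpus_tokens:
        df.update(set(doc))
    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long documents are penalized via b.
            norm = k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores
```

<p>Production baselines use an inverted index (e.g. Pyserini or Elasticsearch) rather than scoring every document, but the formula is the same.</p>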
<div class="path-step">
<div class="num">2</div>
<div class="content">
<h3>Try a zero-shot retriever</h3>
<p>Use <a href="https://hf.co/ncbi/MedCPT-Query-Encoder">MedCPT</a> — the cleanest example of domain-specific contrastive pretraining. Separate query + article encoders make it intuitive. Evaluate on the BEIR biomedical subsets.</p>
</div>
</div>
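<p>MedCPT's asymmetric design means queries and articles pass through different encoders, and relevance is just an inner product in the shared embedding space. A toy sketch of that lookup; the embedding values here are placeholders, where the real encoders would produce 768-dimensional vectors:</p>

```python
def rank_articles(query_vec, article_vecs, top_k=2):
    """Return indices of the top_k articles by dot-product relevance.

    In a MedCPT-style setup, query_vec comes from the query encoder and
    article_vecs from the separately trained article encoder.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    order = sorted(range(len(article_vecs)),
                   key=lambda i: dot(query_vec, article_vecs[i]),
                   reverse=True)
    return order[:top_k]
```

<p>At corpus scale the dot products run through an approximate-nearest-neighbor index (e.g. Faiss) rather than a Python loop, but the ranking semantics are identical.</p>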
<div class="path-step">
<div class="num">3</div>
<div class="content">
<h3>Scale up with LLM retrievers</h3>
<p>Deploy <a href="https://hf.co/BMRetriever/BMRetriever-410M">BMRetriever-410M</a> for production — it outperforms models 11× larger. Use instruction-formatted queries with last-token pooling. The eval code is clean and well-documented.</p>
</div>
</div>
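<p>Last-token pooling, used by decoder-style retrievers such as BMRetriever, takes the hidden state at the final non-padding position instead of averaging over tokens. A sketch assuming right-padded inputs; the instruction template below is purely illustrative, so check the model card for the exact prompt format the released checkpoints expect:</p>

```python
def last_token_pool(hidden_states, attention_mask):
    """Select the hidden state of the last real (non-pad) token.

    hidden_states: one vector per position; attention_mask: 1 for real
    tokens, 0 for padding (inputs assumed right-padded).
    """
    last = max(i for i, m in enumerate(attention_mask) if m == 1)
    return hidden_states[last]

def format_query(instruction, query):
    # Illustrative instruction-formatted query; this wording is an
    # assumption, not the official BMRetriever template.
    return f"{instruction}\nQuery: {query}"
```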
<div class="path-step">
<div class="num">4</div>
<div class="content">
<h3>Comprehensive evaluation</h3>
<p>Benchmark on <a href="https://github.com/MohammadKhodadad/MedTEB">MedTEB</a> — 51 medical embedding tasks, much broader than the BEIR biomedical subsets alone. Introduced in 2025, it is the most comprehensive medical embedding benchmark to date.</p>
</div>
</div>
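<p>Whichever benchmark you run, the headline retrieval number is usually nDCG@10, so it is worth knowing exactly what it computes. A compact reference implementation with graded relevance and the standard log2 position discount:</p>

```python
import math

def ndcg_at_k(ranked_rels, all_rels, k=10):
    """nDCG@k: discounted cumulative gain of the system's ranking,
    normalized by the best achievable ordering of the judgments."""
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(all_rels, reverse=True))
    return dcg(ranked_rels) / ideal if ideal > 0 else 0.0
```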
<div class="path-step">
<div class="num">5</div>
<div class="content">
<h3>Fine-tune your own retriever</h3>
<p>Use <a href="https://arxiv.org/abs/2511.08029">BiCA's</a> citation-graph hard negatives for cheap, effective training data. Or <a href="https://arxiv.org/abs/2604.15591">BioHiCL's</a> MeSH hierarchy supervision — 0.1B params matching 1B+ models.</p>
</div>
</div>
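<p>Both recipes ultimately optimize a contrastive (InfoNCE-style) objective in which hard negatives sharpen the decision boundary. A numerically stable single-query sketch; the similarity values and temperature below are placeholders, not values from either paper:</p>

```python
import math

def info_nce_loss(pos_sim, neg_sims, tau=0.05):
    """Contrastive loss for one query: cross-entropy of the positive
    document against the negatives, with temperature tau. Uses the
    log-sum-exp trick for numerical stability."""
    logits = [pos_sim / tau] + [s / tau for s in neg_sims]
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denom - logits[0]
```

<p>Notice that a "hard" negative, one whose similarity is close to the positive's, dominates the denominator and therefore contributes far more gradient signal than a random in-batch negative.</p>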
</div>
</section>
<!-- FOOTER -->
<footer>
<p>Built with 🧬 as a resource hub for biomedical text retrieval research.</p>
<p>All linked resources are maintained by their respective authors. <a href="https://huggingface.co/lvwerra">@lvwerra</a></p>
</footer>
</body>
</html>