---
title: FetchMerck AI Demo
emoji: 🩺
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 6.14.0
app_file: app.py
pinned: false
hf_oauth: true
hf_oauth_scopes:
- inference-api
license: apache-2.0
short_description: Lightweight RAG demo for clinical decision support
---
# 🩺 FetchMerck AI — Demo
> A lightweight, public demonstration of a **Retrieval-Augmented Generation (RAG)** pipeline for clinical decision support.
---
## ⚠️ Medical Disclaimer
> **This Space is an educational prototype only.**
> It is **not** a medical device and **must not** be used for diagnosis, treatment, triage, or any clinical decision-making. Outputs may be inaccurate or incomplete. **Always consult a licensed clinician** for medical questions.
---
## ✨ What it does
FetchMerck AI demonstrates a minimal, end-to-end RAG pipeline you can read in a single afternoon:
- Embeds your question with `sentence-transformers/all-MiniLM-L6-v2`.
- Retrieves the top-k most similar passages from a small corpus via **cosine similarity over a NumPy matrix** (no vector DB needed).
- Asks a hosted instruction-tuned LLM via the **Hugging Face Inference API** to answer **only** from the retrieved context.
- Surfaces the source topic names alongside every answer, plus the standing disclaimer.
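
The retrieval step can be sketched in a few lines of NumPy. This is a minimal illustration rather than the code in `app.py`: the helper names `l2_normalize` and `top_k`, the default `k`, and the toy 4-dimensional vectors are all assumptions made for the example (real MiniLM embeddings are 384-dimensional). It relies on the query and every corpus row being L2-normalized, so a plain dot product equals cosine similarity.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale vectors along the last axis to unit length."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def top_k(query_vec: np.ndarray, corpus: np.ndarray, k: int = 3):
    """Return (indices, scores) of the k corpus rows most similar to query_vec.

    Both inputs are assumed to be L2-normalized already, so the dot
    product below is the cosine similarity.
    """
    scores = corpus @ query_vec            # shape (N,)
    k = min(k, scores.shape[0])
    idx = np.argsort(-scores)[:k]          # highest score first
    return idx, scores[idx]


# Toy vectors standing in for real embeddings (illustrative only).
corpus = l2_normalize(np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.7, 0.7, 0.0, 0.0],
], dtype=np.float32))
query = l2_normalize(np.array([1.0, 0.1, 0.0, 0.0], dtype=np.float32))

indices, scores = top_k(query, corpus, k=2)
```

A full sort is fine for a corpus of this size; `np.argpartition` would avoid sorting every score if the matrix grew large.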
It ships with two corpus modes:
| Mode | When it activates | Source |
| --- | --- | --- |
| **MedlinePlus corpus** | When `data/corpus.jsonl` and `data/embeddings.npy` are present | NIH MedlinePlus Health Topics (public domain) |
| **Tiny sample corpus** | Fallback, in-memory | Built-in, ensures the demo always boots |
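
The mode selection in the table above could look roughly like this. The function name `select_corpus_mode` and the returned labels are hypothetical; only the two file paths come from this README.

```python
import tempfile
from pathlib import Path


def select_corpus_mode(data_dir: str = "data") -> str:
    """Return "medlineplus" when both artifacts exist, otherwise "sample"."""
    root = Path(data_dir)
    required = [root / "corpus.jsonl", root / "embeddings.npy"]
    return "medlineplus" if all(p.is_file() for p in required) else "sample"


# Demonstration in a throwaway directory: the mode flips once both files exist.
with tempfile.TemporaryDirectory() as tmp:
    mode_before = select_corpus_mode(tmp)
    (Path(tmp) / "corpus.jsonl").write_text("", encoding="utf-8")
    (Path(tmp) / "embeddings.npy").write_bytes(b"")
    mode_after = select_corpus_mode(tmp)
```

Requiring both files avoids a half-uploaded corpus, where the text and the embedding matrix would disagree on row count.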
---
## 🔬 How it works
1. The user enters a clinical question.
2. The query is embedded with MiniLM and L2-normalized.
3. Cosine similarity is computed between the query vector and every row of the corpus matrix (a single dot product, since both sides are L2-normalized), and the top-k passages are selected.
4. Those passages are concatenated into a grounded context window.
5. A hosted LLM is prompted to answer **only** from that context.
6. The answer is rendered with source topic names and the medical disclaimer.
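
Steps 4 to 6 can be sketched without calling any model. The prompt wording, the helper names `build_prompt` and `source_topics`, and the placeholder passages are assumptions for illustration; the actual prompt used by the Space may differ. Passages are assumed to be dicts with `topic` and `text` keys.

```python
def build_prompt(question: str, passages: list[dict]) -> str:
    """Concatenate retrieved passages into a grounded prompt."""
    context = "\n\n".join(
        f"[{i}] {p['topic']}: {p['text']}"
        for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer the question using ONLY the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def source_topics(passages: list[dict]) -> list[str]:
    """Unique topic names, kept in retrieval order, for display with the answer."""
    seen: list[str] = []
    for p in passages:
        if p["topic"] not in seen:
            seen.append(p["topic"])
    return seen


# Placeholder passages (not real corpus content).
passages = [
    {"topic": "Asthma", "text": "placeholder passage A"},
    {"topic": "Allergy", "text": "placeholder passage B"},
    {"topic": "Asthma", "text": "placeholder passage C"},
]
prompt = build_prompt("What is asthma?", passages)
topics = source_topics(passages)
```

Numbering the passages in the context makes it easier to trace which passage supports which part of an answer.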
---
## 🛠️ Build the MedlinePlus corpus locally
A local-only ingest script lives at `scripts/ingest_medline.py`. It downloads the latest **MedlinePlus Health Topics XML** (public domain), chunks each topic summary, and embeds the chunks with MiniLM.
```bash
python -m venv .venv && source .venv/bin/activate
pip install -U sentence-transformers numpy lxml
python scripts/ingest_medline.py
```
This produces:
- `data/corpus.jsonl` — one chunk per line: `{id, topic, section, url, text}`
- `data/embeddings.npy` — float32 matrix, L2-normalized, shape `(N, 384)`
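
Before uploading, it can be worth checking that the two files agree with each other. The sketch below is not part of the repository; `check_artifacts` and its tolerance are hypothetical, and the demo writes toy files to a temporary directory instead of reading real ingest output.

```python
import json
import tempfile
from pathlib import Path

import numpy as np

REQUIRED_KEYS = {"id", "topic", "section", "url", "text"}


def check_artifacts(data_dir: str, dim: int = 384, atol: float = 1e-3) -> int:
    """Validate corpus.jsonl against embeddings.npy; return the chunk count."""
    root = Path(data_dir)
    with open(root / "corpus.jsonl", encoding="utf-8") as fh:
        rows = [json.loads(line) for line in fh if line.strip()]
    emb = np.load(root / "embeddings.npy")

    if emb.shape != (len(rows), dim):
        raise ValueError(f"expected shape ({len(rows)}, {dim}), got {emb.shape}")
    if emb.dtype != np.float32:
        raise ValueError(f"expected float32, got {emb.dtype}")
    if not np.allclose(np.linalg.norm(emb, axis=1), 1.0, atol=atol):
        raise ValueError("embedding rows are not L2-normalized")
    for i, row in enumerate(rows):
        missing = REQUIRED_KEYS - set(row)
        if missing:
            raise ValueError(f"row {i} is missing keys: {sorted(missing)}")
    return len(rows)


# Toy artifacts for demonstration only.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    demo_rows = [
        {"id": f"demo-{i}", "topic": "Demo", "section": "summary",
         "url": "", "text": "placeholder"}
        for i in range(2)
    ]
    (root / "corpus.jsonl").write_text(
        "\n".join(json.dumps(r) for r in demo_rows), encoding="utf-8"
    )
    np.save(root / "embeddings.npy", np.eye(2, 384, dtype=np.float32))
    n_chunks = check_artifacts(tmp)
```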
Optional environment variables for the script:
| Variable | Purpose |
| --- | --- |
| `MEDLINE_XML_URL` | Pin a specific snapshot (e.g. `https://medlineplus.gov/xml/mplus_topics_YYYY-MM-DD.xml.zip`) |
| `EMBED_MODEL` | Override the embedding model |
| `CHUNK_TOKENS` | Chunk size in tokens (default `300`) |
| `CHUNK_OVERLAP` | Chunk overlap in tokens (default `50`) |
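
To show how `CHUNK_TOKENS` and `CHUNK_OVERLAP` interact, here is a sliding-window chunker. It is a sketch under a loud assumption: it treats whitespace-separated words as tokens, whereas the ingest script may use a real tokenizer. The function name `chunk_words` is hypothetical.

```python
import os


def chunk_words(words: list[str], size: int, overlap: int) -> list[list[str]]:
    """Split words into windows of `size` that share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break                      # the last window reached the end
    return chunks


# Documented defaults, overridable through the environment.
size = int(os.environ.get("CHUNK_TOKENS", "300"))
overlap = int(os.environ.get("CHUNK_OVERLAP", "50"))

# Small explicit demo so the overlap is easy to see.
words = [f"w{i}" for i in range(10)]
demo = chunk_words(words, size=4, overlap=1)
```

With `size=4` and `overlap=1` the window advances three words at a time, so each chunk starts with the last word of the previous one.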
Then upload `data/corpus.jsonl` and `data/embeddings.npy` to a top-level `data/` folder in the **Files** tab of this Space. The Space picks them up on its next restart.
---
## ⚙️ Configuration
Optional environment variables / Space secrets:
| Variable | Default | Purpose |
| --- | --- | --- |
| `HF_TOKEN` | — | Hugging Face token (needed for gated or private generation models) |
| `GEN_MODEL` | `meta-llama/Llama-3.1-8B-Instruct` | Override the hosted generation model |
| `EMBED_MODEL` | `sentence-transformers/all-MiniLM-L6-v2` | Override the embedding model |
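
One way to read these settings with the defaults from the table is shown below. The `Settings` dataclass and `load_settings` function are hypothetical names, and `my-org/my-model` is a placeholder model ID, not a recommendation.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    hf_token: Optional[str]
    gen_model: str
    embed_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from an environment mapping, applying table defaults."""
    return Settings(
        hf_token=env.get("HF_TOKEN") or None,   # empty string counts as unset
        gen_model=env.get("GEN_MODEL", "meta-llama/Llama-3.1-8B-Instruct"),
        embed_model=env.get(
            "EMBED_MODEL", "sentence-transformers/all-MiniLM-L6-v2"
        ),
    )


defaults = load_settings({})
overridden = load_settings({"GEN_MODEL": "my-org/my-model"})
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.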
---
## 🗺️ Roadmap
- [x] Lightweight publishable v0 with sample corpus
- [x] MedlinePlus ingest script + auto-load when uploaded
- [ ] Add additional public-domain / openly licensed corpora (CDC, NICE OGL, OpenStax)
- [ ] Move retrieval to a persistent vector store (e.g. Chroma) once the corpus grows
- [ ] Optional local GGUF inference on GPU hardware
---
## 🚫 What this Space deliberately does **not** do
- It does **not** include or redistribute the *Merck Manuals* or any other restricted, paywalled, or copyrighted clinical reference content.
- It does **not** provide medical advice.
---
## 📚 Attribution
Health-topic content used by the prebuilt corpus is adapted from **MedlinePlus**, a service of the U.S. National Library of Medicine, National Institutes of Health. MedlinePlus content is in the public domain and free to reuse.
This project is **not affiliated with, endorsed by, or sponsored by** NLM, NIH, or HHS.
---
## 📄 License
Released under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).