---
title: FetchMerck AI Demo
emoji: 🩺
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 6.14.0
app_file: app.py
pinned: false
hf_oauth: true
hf_oauth_scopes:
  - inference-api
license: apache-2.0
short_description: Lightweight RAG demo for clinical decision support
---

# 🩺 FetchMerck AI – Demo
|
|
> A lightweight, public demonstration of a **Retrieval-Augmented Generation (RAG)** pipeline for clinical decision support.
|
|
[Gradio](https://www.gradio.app/) · [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0)

---
|
|
## ⚠️ Medical Disclaimer

> **This Space is an educational prototype only.**
> It is **not** a medical device and **must not** be used for diagnosis, treatment, triage, or any clinical decision-making. Outputs may be inaccurate or incomplete. **Always consult a licensed clinician** for medical questions.

---
|
|
## ✨ What it does

FetchMerck AI demonstrates a minimal, end-to-end RAG pipeline you can read in a single afternoon:
|
|
- Embeds your question with `sentence-transformers/all-MiniLM-L6-v2`.
- Retrieves the top-k most similar passages from a small corpus via **cosine similarity over a NumPy matrix** (no vector DB needed).
- Asks a hosted instruction-tuned LLM via the **Hugging Face Inference API** to answer **only** from the retrieved context.
- Surfaces the source topic names alongside every answer, plus the standing disclaimer.
|
|
It ships with two corpus modes:

| Mode | When it activates | Source |
| --- | --- | --- |
| **MedlinePlus corpus** | When `data/corpus.jsonl` and `data/embeddings.npy` are present | NIH MedlinePlus Health Topics (public domain) |
| **Tiny sample corpus** | Fallback, in-memory | Built-in, ensures the demo always boots |
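
The mode selection above can be sketched as follows. This is a minimal illustration: the file names match the table, but the `load_corpus` helper and the two-entry fallback corpus shown here are hypothetical, not the app's actual code.

```python
import json
from pathlib import Path

# Hypothetical built-in fallback; the real app ships its own tiny corpus.
TINY_CORPUS = [
    {"topic": "Hydration", "text": "Adults should drink water regularly throughout the day."},
    {"topic": "Hand washing", "text": "Washing hands with soap reduces the spread of infection."},
]

def load_corpus(data_dir="data"):
    """Return (chunks, mode): the MedlinePlus corpus if its files exist, else the tiny fallback."""
    corpus_path = Path(data_dir) / "corpus.jsonl"
    emb_path = Path(data_dir) / "embeddings.npy"
    if corpus_path.exists() and emb_path.exists():
        chunks = [json.loads(line) for line in corpus_path.read_text().splitlines() if line.strip()]
        return chunks, "medlineplus"
    return TINY_CORPUS, "tiny-sample"
```

With no `data/` directory present, `load_corpus()` returns the in-memory fallback, which is what guarantees the demo always boots.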
|
|
---

## 🔬 How it works
|
|
1. The user enters a clinical question.
2. The query is embedded with MiniLM and L2-normalized.
3. Cosine similarity is computed against the corpus matrix; the top-k passages are selected.
4. Those passages are concatenated into a grounded context window.
5. A hosted LLM is prompted to answer **only** from that context.
6. The answer is rendered with source topic names and the medical disclaimer.
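
Steps 2–5 can be sketched with NumPy alone. This is a toy illustration: the hand-built vectors below stand in for MiniLM embeddings (which the app computes), and the prompt template is hypothetical, not the app's actual wording.

```python
import numpy as np

def normalize(m):
    """L2-normalize along the last axis."""
    return m / np.linalg.norm(m, axis=-1, keepdims=True)

def top_k(query_vec, corpus_matrix, k=3):
    """With L2-normalized rows, cosine similarity reduces to a dot product."""
    sims = corpus_matrix @ query_vec
    return np.argsort(sims)[::-1][:k]

# Stand-in embeddings (the app would use MiniLM here).
corpus = normalize(np.array([
    [1.0, 0.1, 0.0],   # passage 0
    [0.0, 1.0, 0.2],   # passage 1
    [0.1, 0.0, 1.0],   # passage 2
], dtype=np.float32))
passages = ["Passage about fever.", "Passage about rashes.", "Passage about coughs."]

query = normalize(np.array([0.9, 0.2, 0.0], dtype=np.float32))
idx = top_k(query, corpus, k=2)

# Concatenate the retrieved passages into a grounded context window,
# then constrain the LLM to it (hypothetical template).
context = "\n\n".join(passages[i] for i in idx)
prompt = f"Answer ONLY from the context below.\n\nContext:\n{context}\n\nQuestion: ..."
```

The dot-product shortcut is why the app L2-normalizes both the query and the corpus matrix up front: ranking then costs a single matrix-vector product.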
|
|
---

## 🛠️ Build the MedlinePlus corpus locally

A local-only ingest script lives at `scripts/ingest_medline.py`. It downloads the latest **MedlinePlus Health Topics XML** (public domain), chunks each topic summary, and embeds the chunks with MiniLM.
|
|
```bash
python -m venv .venv && source .venv/bin/activate
pip install -U sentence-transformers numpy lxml
python scripts/ingest_medline.py
```
|
|
This produces:

- `data/corpus.jsonl` – one chunk per line: `{id, topic, section, url, text}`
- `data/embeddings.npy` – float32 matrix, L2-normalized, shape `(N, 384)`
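
A quick way to sanity-check the two artifacts before uploading is to confirm the row counts match and every embedding row is unit-length. The `validate_artifacts` helper below is illustrative, not part of the repo:

```python
import json
from pathlib import Path

import numpy as np

def validate_artifacts(jsonl_lines, embeddings):
    """Check that corpus.jsonl and embeddings.npy agree with the documented format."""
    chunks = [json.loads(line) for line in jsonl_lines if line.strip()]
    assert embeddings.dtype == np.float32, "expected a float32 matrix"
    assert embeddings.shape == (len(chunks), 384), "one 384-d row per chunk"
    norms = np.linalg.norm(embeddings, axis=1)
    assert np.allclose(norms, 1.0, atol=1e-4), "rows should be L2-normalized"
    return chunks

# Only runs when the artifacts are actually present.
if Path("data/corpus.jsonl").exists():
    chunks = validate_artifacts(
        Path("data/corpus.jsonl").read_text().splitlines(),
        np.load("data/embeddings.npy"),
    )
    print(f"OK: {len(chunks)} chunks")
```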
|
|
Optional environment variables for the script:

| Variable | Purpose |
| --- | --- |
| `MEDLINE_XML_URL` | Pin a specific snapshot (e.g. `https://medlineplus.gov/xml/mplus_topics_YYYY-MM-DD.xml.zip`) |
| `EMBED_MODEL` | Override the embedding model |
| `CHUNK_TOKENS` | Chunk size in tokens (default `300`) |
| `CHUNK_OVERLAP` | Chunk overlap in tokens (default `50`) |
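
`CHUNK_TOKENS` and `CHUNK_OVERLAP` describe a sliding window over each topic summary. A minimal sketch of that windowing, assuming whitespace tokenization (the real script may tokenize differently):

```python
def chunk_text(text, chunk_tokens=300, chunk_overlap=50):
    """Split text into overlapping windows of roughly chunk_tokens tokens."""
    tokens = text.split()
    step = chunk_tokens - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_tokens]))
        if start + chunk_tokens >= len(tokens):
            break  # last window already reached the end of the text
    return chunks
```

With the defaults, consecutive chunks share 50 tokens of context, which helps retrieval when a relevant sentence straddles a chunk boundary.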
|
|
Then drag `data/corpus.jsonl` and `data/embeddings.npy` into the **Files** tab of this Space (under a top-level `data/` folder). The Space will pick them up on the next restart.

---

## ⚙️ Configuration

Optional environment variables / Space secrets:
|
|
| Variable | Default | Purpose |
| --- | --- | --- |
| `HF_TOKEN` | – | Hugging Face token (needed for gated or private generation models) |
| `GEN_MODEL` | `meta-llama/Llama-3.1-8B-Instruct` | Override the hosted generation model |
| `EMBED_MODEL` | `sentence-transformers/all-MiniLM-L6-v2` | Override the embedding model |
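
The table above implies a simple precedence: an environment variable, if set, overrides the documented default. A sketch of how the app might resolve these (the `read_config` helper is illustrative, not the app's actual code):

```python
import os

DEFAULTS = {
    "GEN_MODEL": "meta-llama/Llama-3.1-8B-Instruct",
    "EMBED_MODEL": "sentence-transformers/all-MiniLM-L6-v2",
}

def read_config(env=None):
    """Resolve settings from the environment, falling back to the defaults above."""
    env = os.environ if env is None else env
    return {
        "gen_model": env.get("GEN_MODEL", DEFAULTS["GEN_MODEL"]),
        "embed_model": env.get("EMBED_MODEL", DEFAULTS["EMBED_MODEL"]),
        "hf_token": env.get("HF_TOKEN"),  # None is fine for public models
    }
```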
|
|
---

## 🗺️ Roadmap
|
|
- [x] Lightweight publishable v0 with sample corpus
- [x] MedlinePlus ingest script + auto-load when uploaded
- [ ] Add additional public-domain / openly licensed corpora (CDC, NICE OGL, OpenStax)
- [ ] Move retrieval to a persistent vector store (e.g. Chroma) once the corpus grows
- [ ] Optional local GGUF inference on GPU hardware
|
|
---

## 🚫 What this Space deliberately does **not** do

- It does **not** include or redistribute the *Merck Manuals* or any other restricted, paywalled, or copyrighted clinical reference content.
- It does **not** provide medical advice.
|
|
---

## 📚 Attribution

Health-topic content used by the prebuilt corpus is adapted from **MedlinePlus**, a service of the U.S. National Library of Medicine, National Institutes of Health. MedlinePlus content is in the public domain and free to reuse.
|
|
This project is **not affiliated with, endorsed by, or sponsored by** NLM, NIH, or HHS.
|
|
---

## 📄 License

Released under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).
|
|