---
title: FetchMerck AI Demo
emoji: 🩺
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 6.14.0
app_file: app.py
pinned: false
hf_oauth: true
hf_oauth_scopes:
  - inference-api
license: apache-2.0
short_description: Lightweight RAG demo for clinical decision support
---

# 🩺 FetchMerck AI – Demo

> A lightweight, public demonstration of a **Retrieval-Augmented Generation (RAG)** pipeline for clinical decision support.

[![Gradio](https://img.shields.io/badge/Gradio-6.14.0-orange)](https://www.gradio.app/)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://www.apache.org/licenses/LICENSE-2.0)
[![Status](https://img.shields.io/badge/status-demo-yellow)]()

---

## ⚠️ Medical Disclaimer

> **This Space is an educational prototype only.**
> It is **not** a medical device and **must not** be used for diagnosis, treatment, triage, or any clinical decision-making. Outputs may be inaccurate or incomplete. **Always consult a licensed clinician** for medical questions.

---

## ✨ What it does

FetchMerck AI demonstrates a minimal, end-to-end RAG pipeline you can read in a single afternoon:

- Embeds your question with `sentence-transformers/all-MiniLM-L6-v2`.
- Retrieves the top-k most similar passages from a small corpus via **cosine similarity over a NumPy matrix** (no vector DB needed).
- Asks a hosted instruction-tuned LLM via the **Hugging Face Inference API** to answer **only** from the retrieved context.
- Surfaces the source topic names alongside every answer, plus the standing disclaimer.

It ships with two corpus modes:

| Mode | When it activates | Source |
| --- | --- | --- |
| **MedlinePlus corpus** | When `data/corpus.jsonl` and `data/embeddings.npy` are present | NIH MedlinePlus Health Topics (public domain) |
| **Tiny sample corpus** | Fallback, in-memory | Built-in; ensures the demo always boots |

---

## 🔬 How it works

1. The user enters a clinical question.
2. The query is embedded with MiniLM and L2-normalized.
3. Cosine similarity is computed against the corpus matrix; the top-k passages are selected.
4. Those passages are concatenated into a grounded context window.
5. A hosted LLM is prompted to answer **only** from that context.
6. The answer is rendered with source topic names and the medical disclaimer.

---

## 🛠️ Build the MedlinePlus corpus locally

A local-only ingest script lives at `scripts/ingest_medline.py`. It downloads the latest **MedlinePlus Health Topics XML** (public domain), chunks each topic summary, and embeds the chunks with MiniLM.

```bash
python -m venv .venv && source .venv/bin/activate
pip install -U sentence-transformers numpy lxml
python scripts/ingest_medline.py
```

This produces:

- `data/corpus.jsonl` – one chunk per line: `{id, topic, section, url, text}`
- `data/embeddings.npy` – float32 matrix, L2-normalized, shape `(N, 384)`

Optional environment variables for the script:

| Variable | Purpose |
| --- | --- |
| `MEDLINE_XML_URL` | Pin a specific snapshot (e.g. `https://medlineplus.gov/xml/mplus_topics_YYYY-MM-DD.xml.zip`) |
| `EMBED_MODEL` | Override the embedding model |
| `CHUNK_TOKENS` | Chunk size in tokens (default `300`) |
| `CHUNK_OVERLAP` | Chunk overlap in tokens (default `50`) |

Then drag `data/corpus.jsonl` and `data/embeddings.npy` into the **Files** tab of this Space (under a top-level `data/` folder). The Space will pick them up on the next restart.
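The retrieval core described in "How it works" (steps 2–3: dot product over an L2-normalized NumPy matrix, then top-k) fits in a few lines. The sketch below is illustrative only, not the Space's actual `app.py`; the function name and the toy 3-dimensional corpus are invented for demonstration, and in the real pipeline the vectors come from MiniLM and `data/embeddings.npy`.

```python
import numpy as np

def top_k_passages(query_vec: np.ndarray, corpus: np.ndarray, k: int = 3):
    """Return (indices, scores) of the k corpus rows most similar to the query.

    Both the query vector and the corpus rows are assumed L2-normalized,
    so a plain dot product equals cosine similarity.
    """
    scores = corpus @ query_vec        # shape (N,): cosine similarity per passage
    idx = np.argsort(-scores)[:k]      # indices of the k highest scores, best first
    return idx, scores[idx]

# Toy corpus: 4 "passages" embedded in a 3-dimensional space, then normalized.
corpus = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.8, 0.6, 0.0],
                   [0.0, 0.0, 1.0]], dtype=np.float32)
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)

query = np.array([1.0, 0.1, 0.0], dtype=np.float32)
query /= np.linalg.norm(query)

idx, scores = top_k_passages(query, corpus, k=2)
```

Because the matrix is small, a brute-force `argsort` is all that is needed; this is exactly why the demo can skip a vector database until the corpus grows.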
---

## ⚙️ Configuration

Optional environment variables / Space secrets:

| Variable | Default | Purpose |
| --- | --- | --- |
| `HF_TOKEN` | – | Hugging Face token (needed for gated or private generation models) |
| `GEN_MODEL` | `meta-llama/Llama-3.1-8B-Instruct` | Override the hosted generation model |
| `EMBED_MODEL` | `sentence-transformers/all-MiniLM-L6-v2` | Override the embedding model |

---

## 🗺️ Roadmap

- [x] Lightweight publishable v0 with sample corpus
- [x] MedlinePlus ingest script + auto-load when uploaded
- [ ] Add additional public-domain / openly licensed corpora (CDC, NICE OGL, OpenStax)
- [ ] Move retrieval to a persistent vector store (e.g. Chroma) once the corpus grows
- [ ] Optional local GGUF inference on GPU hardware

---

## 🚫 What this Space deliberately does **not** do

- It does **not** include or redistribute the *Merck Manuals* or any other restricted, paywalled, or copyrighted clinical reference content.
- It does **not** provide medical advice.

---

## 📚 Attribution

Health-topic content used by the prebuilt corpus is adapted from **MedlinePlus**, a service of the U.S. National Library of Medicine, National Institutes of Health. MedlinePlus content is in the public domain and free to reuse. This project is **not affiliated with, endorsed by, or sponsored by** NLM, NIH, or HHS.

---

## 📄 License

Released under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).
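As an appendix, the configuration variables from the table in the Configuration section are typically read with environment-variable fallbacks. This is a minimal sketch whose defaults mirror that table; the exact variable handling in `app.py` may differ.

```python
import os

# Defaults mirror the configuration table; any of these can be
# overridden via environment variables / Space secrets.
GEN_MODEL = os.environ.get("GEN_MODEL", "meta-llama/Llama-3.1-8B-Instruct")
EMBED_MODEL = os.environ.get("EMBED_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
HF_TOKEN = os.environ.get("HF_TOKEN")  # may be None for ungated public models
```

Keeping `HF_TOKEN` as a Space secret (never hard-coded) is what lets the same code run both with gated models and fully anonymously.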