	---
	title: FetchMerck AI Demo
	emoji: 🩺
	colorFrom: yellow
	colorTo: purple
	sdk: gradio
	sdk_version: 6.14.0
	app_file: app.py
	pinned: false
	hf_oauth: true
	hf_oauth_scopes:
	- inference-api
	license: apache-2.0
	short_description: Lightweight RAG demo for clinical decision support
	---

	# 🩺 FetchMerck AI — Demo

	> A lightweight, public demonstration of a Retrieval-Augmented Generation (RAG) pipeline for clinical decision support.

	[![Gradio](https://img.shields.io/badge/Gradio-6.14.0-orange)](https://www.gradio.app/)
	[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://www.apache.org/licenses/LICENSE-2.0)
	![Status](https://img.shields.io/badge/status-demo-yellow)

	---

	## ⚠️ Medical Disclaimer

	> This Space is an educational prototype only.
	> It is not a medical device and must not be used for diagnosis, treatment, triage, or any clinical decision-making. Outputs may be inaccurate or incomplete. Always consult a licensed clinician for medical questions.

	---

	## ✨ What it does

	FetchMerck AI demonstrates a minimal, end-to-end RAG pipeline you can read in a single afternoon:

	- Embeds your question with `sentence-transformers/all-MiniLM-L6-v2`.
	- Retrieves the top-k most similar passages from a small corpus via cosine similarity over a NumPy matrix (no vector DB needed).
	- Asks a hosted instruction-tuned LLM via the Hugging Face Inference API to answer only from the retrieved context.
	- Surfaces the source topic names alongside every answer, plus the standing disclaimer.
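
The retrieval step can be illustrated with a minimal sketch. It assumes the corpus embeddings are already L2-normalized, so a plain dot product equals cosine similarity. The helper names `l2_normalize` and `top_k` and the toy 4-dimensional vectors (standing in for 384-dimensional MiniLM embeddings) are illustrative assumptions, not code taken from `app.py`.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale vectors along the last axis to unit length (zero vectors stay zero)."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def top_k(query_vec: np.ndarray, corpus: np.ndarray, k: int = 3):
    """Return (indices, scores) of the k corpus rows most similar to query_vec.

    Assumes every row of `corpus` is already L2-normalized.
    """
    q = l2_normalize(query_vec.astype(np.float32))
    scores = corpus @ q                 # dot product == cosine on unit vectors
    k = min(k, len(scores))
    idx = np.argsort(-scores)[:k]       # highest score first
    return idx, scores[idx]


# Toy corpus: rows 0 and 2 point in nearly the same direction as the query.
corpus = l2_normalize(
    np.array(
        [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.9, 0.1, 0.0, 0.0]],
        dtype=np.float32,
    )
)
idx, scores = top_k(np.array([1.0, 0.0, 0.0, 0.0]), corpus, k=2)
```

A full sort with `np.argsort` is simple and fast enough for a small corpus; `np.argpartition` is a common alternative once the matrix grows.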

	It ships with two corpus modes:

	| Mode | When it activates | Source |
	| --- | --- | --- |
	| MedlinePlus corpus | When `data/corpus.jsonl` and `data/embeddings.npy` are present | NIH MedlinePlus Health Topics (public domain) |
	| Tiny sample corpus | Fallback, in-memory | Built-in, ensures the demo always boots |
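
The mode switch in the table could look roughly like the sketch below. It assumes the check is a simple "both files exist" test. `choose_corpus_mode`, `load_passages`, and the contents of `SAMPLE_CORPUS` are hypothetical names and placeholder data, not the Space's actual code.

```python
import json
import tempfile
from pathlib import Path

# Placeholder stand-in for the built-in sample corpus (contents are made up).
SAMPLE_CORPUS = [
    {"id": "sample-0", "topic": "Sample topic", "text": "Placeholder passage."},
]


def choose_corpus_mode(data_dir) -> str:
    """Return 'medlineplus' only when both artifacts are present, else 'sample'."""
    data_dir = Path(data_dir)
    have_both = (data_dir / "corpus.jsonl").is_file() and (
        data_dir / "embeddings.npy"
    ).is_file()
    return "medlineplus" if have_both else "sample"


def load_passages(data_dir):
    """Load passages from corpus.jsonl, or fall back to the in-memory sample."""
    if choose_corpus_mode(data_dir) == "sample":
        return list(SAMPLE_CORPUS)
    with open(Path(data_dir) / "corpus.jsonl", encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Demonstration in a throwaway directory: empty first, then with both files.
with tempfile.TemporaryDirectory() as tmp:
    mode_before = choose_corpus_mode(tmp)
    record = {"id": "t-0", "topic": "Asthma", "text": "Example text."}
    (Path(tmp) / "corpus.jsonl").write_text(json.dumps(record) + "\n", encoding="utf-8")
    (Path(tmp) / "embeddings.npy").write_bytes(b"")  # existence is all that is checked
    mode_after = choose_corpus_mode(tmp)
    passages_after = load_passages(tmp)
```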

	---

	## 🔬 How it works

	1. The user enters a clinical question.
	2. The query is embedded with MiniLM and L2-normalized.
	3. Cosine similarity is computed against the corpus matrix; the top-k passages are selected.
	4. Those passages are concatenated into a grounded context window.
	5. A hosted LLM is prompted to answer only from that context.
	6. The answer is rendered with source topic names and the medical disclaimer.
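
Steps 4 to 6 can be sketched as plain string handling. The prompt wording, the `max_chars` budget, and the function names `build_prompt` and `source_topics` are assumptions made for illustration; the prompt the Space actually sends may differ. The call to the hosted LLM is left out.

```python
def build_prompt(question: str, passages: list, max_chars: int = 4000) -> str:
    """Concatenate retrieved passages into a context block and wrap the question."""
    blocks, used = [], 0
    for i, passage in enumerate(passages, start=1):
        block = f"[{i}] {passage['topic']}\n{passage['text']}"
        if used + len(block) > max_chars:
            break  # stop before the context exceeds the character budget
        blocks.append(block)
        used += len(block)
    context = "\n\n".join(blocks)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def source_topics(passages: list) -> list:
    """Return topic names in retrieval order, without duplicates."""
    seen = []
    for passage in passages:
        if passage["topic"] not in seen:
            seen.append(passage["topic"])
    return seen


retrieved = [
    {"topic": "Asthma", "text": "Placeholder passage one."},
    {"topic": "Asthma", "text": "Placeholder passage two."},
    {"topic": "Allergy", "text": "Placeholder passage three."},
]
prompt = build_prompt("What can trigger wheezing?", retrieved)
topics = source_topics(retrieved)
```

Numbering the passages in the context makes it easier to trace an answer back to the source topics shown next to it.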

	---

	## 🛠️ Build the MedlinePlus corpus locally

	A local-only ingest script lives at `scripts/ingest_medline.py`. It downloads the latest MedlinePlus Health Topics XML (public domain), chunks each topic summary, and embeds the chunks with MiniLM.

	```bash
	python -m venv .venv && source .venv/bin/activate
	pip install -U sentence-transformers numpy lxml
	python scripts/ingest_medline.py
	```

	This produces:

	- `data/corpus.jsonl` — one chunk per line: `{id, topic, section, url, text}`
	- `data/embeddings.npy` — float32 matrix, L2-normalized, shape `(N, 384)`
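
Before uploading, it can be worth checking that the two files agree with each other. The sketch below is a hypothetical sanity check, not part of `scripts/ingest_medline.py`. It verifies the record keys listed above, the `(N, 384)` float32 shape, and unit-length rows, and it runs against synthetic data in a temporary directory.

```python
import json
import tempfile
from pathlib import Path

import numpy as np

REQUIRED_KEYS = {"id", "topic", "section", "url", "text"}


def check_artifacts(corpus_path, emb_path, dim: int = 384) -> int:
    """Validate corpus.jsonl against embeddings.npy; return the number of chunks."""
    with open(corpus_path, encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    for rec in records:
        missing = REQUIRED_KEYS - set(rec)
        if missing:
            raise ValueError(f"record {rec.get('id')!r} is missing {sorted(missing)}")
    emb = np.load(emb_path)
    if emb.dtype != np.float32 or emb.shape != (len(records), dim):
        raise ValueError(f"unexpected embeddings: dtype={emb.dtype}, shape={emb.shape}")
    if not np.allclose(np.linalg.norm(emb, axis=1), 1.0, atol=1e-3):
        raise ValueError("embedding rows are not L2-normalized")
    return len(records)


# Synthetic two-chunk corpus, used only to exercise the check.
with tempfile.TemporaryDirectory() as tmp:
    corpus_path = Path(tmp) / "corpus.jsonl"
    emb_path = Path(tmp) / "embeddings.npy"
    rows = [
        {"id": f"c{i}", "topic": "T", "section": "summary", "url": "", "text": "x"}
        for i in range(2)
    ]
    corpus_path.write_text("\n".join(json.dumps(r) for r in rows) + "\n", encoding="utf-8")
    emb = np.random.default_rng(0).normal(size=(2, 384)).astype(np.float32)
    emb /= np.linalg.norm(emb, axis=1, keepdims=True)
    np.save(emb_path, emb)
    n_checked = check_artifacts(corpus_path, emb_path)
```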

	Optional environment variables for the script:

	| Variable | Purpose |
	| --- | --- |
	| `MEDLINE_XML_URL` | Pin a specific snapshot (e.g. `https://medlineplus.gov/xml/mplus_topics_YYYY-MM-DD.xml.zip`) |
	| `EMBED_MODEL` | Override the embedding model |
	| `CHUNK_TOKENS` | Chunk size in tokens (default `300`) |
	| `CHUNK_OVERLAP` | Chunk overlap in tokens (default `50`) |
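
To show how `CHUNK_TOKENS` and `CHUNK_OVERLAP` interact, here is a minimal sliding-window chunker. It is a sketch under a loud assumption: whitespace-separated words stand in for tokens, whereas the real script may count tokens with the model's tokenizer. `chunk_words` and `chunk_settings_from_env` are hypothetical helper names.

```python
import os


def chunk_settings_from_env() -> tuple:
    """Read chunk size and overlap from the environment, using the documented defaults."""
    size = int(os.environ.get("CHUNK_TOKENS", "300"))
    overlap = int(os.environ.get("CHUNK_OVERLAP", "50"))
    return size, overlap


def chunk_words(text: str, size: int = 300, overlap: int = 50) -> list:
    """Split text into windows of `size` words; adjacent windows share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap               # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break                       # the last window already reached the end
    return chunks


# 700 numbered words give windows starting at words 0, 250, and 500.
text = " ".join(f"w{i}" for i in range(700))
chunks = chunk_words(text, size=300, overlap=50)
```

With the defaults, each chunk repeats the last 50 words of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.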

	Then upload `data/corpus.jsonl` and `data/embeddings.npy` through the Files tab of this Space, placing both under a top-level `data/` folder. The Space picks them up on its next restart.

	---

	## ⚙️ Configuration

	Optional environment variables / Space secrets:

	| Variable | Default | Purpose |
	| --- | --- | --- |
	| `HF_TOKEN` | (none) | Hugging Face token (needed for gated or private generation models) |
	| `GEN_MODEL` | `meta-llama/Llama-3.1-8B-Instruct` | Override the hosted generation model |
	| `EMBED_MODEL` | `sentence-transformers/all-MiniLM-L6-v2` | Override the embedding model |
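
One plausible way to read these settings is shown below. The `Settings` dataclass and `load_settings` function are hypothetical; only the variable names and defaults come from the table. Treating an empty string as "unset" is a design assumption of this sketch.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    hf_token: Optional[str]
    gen_model: str
    embed_model: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build settings from environment variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        hf_token=env.get("HF_TOKEN") or None,  # empty string counts as unset
        gen_model=env.get("GEN_MODEL") or "meta-llama/Llama-3.1-8B-Instruct",
        embed_model=env.get("EMBED_MODEL") or "sentence-transformers/all-MiniLM-L6-v2",
    )


defaults = load_settings({})
overridden = load_settings({"GEN_MODEL": "example-org/example-model"})
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching the real environment.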

	---

	## 🗺️ Roadmap

	- [x] Lightweight publishable v0 with sample corpus
	- [x] MedlinePlus ingest script + auto-load when uploaded
	- [ ] Add additional public-domain / openly licensed corpora (CDC, NICE OGL, OpenStax)
	- [ ] Move retrieval to a persistent vector store (e.g. Chroma) once the corpus grows
	- [ ] Optional local GGUF inference on GPU hardware

	---

	## 🚫 What this Space deliberately does not do

	- It does not include or redistribute the Merck Manuals or any other restricted, paywalled, or copyrighted clinical reference content.
	- It does not provide medical advice.

	---

	## 📚 Attribution

	Health-topic content used by the prebuilt corpus is adapted from MedlinePlus, a service of the U.S. National Library of Medicine, National Institutes of Health. MedlinePlus Health Topics summaries are in the public domain and free to reuse.

	This project is not affiliated with, endorsed by, or sponsored by NLM, NIH, or HHS.

	---

	## 📄 License

	Released under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).