jeremygracey-ai committed on
Commit e3642e5 · verified · 1 Parent(s): 5f6d2d1

Rewrite README: lightweight RAG demo, sample corpus, medical disclaimer, no Merck content

Files changed (1)
  1. README.md +54 -3
README.md CHANGED
@@ -1,6 +1,6 @@
  ---
  title: FetchMerck AI Demo
- emoji: 💬
  colorFrom: yellow
  colorTo: purple
  sdk: gradio
@@ -11,7 +11,58 @@ hf_oauth: true
  hf_oauth_scopes:
  - inference-api
  license: apache-2.0
- short_description: RAG clinical decision support from the Merck Manuals
  ---
 
- An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
  ---
  title: FetchMerck AI Demo
+ emoji: 🩺
  colorFrom: yellow
  colorTo: purple
  sdk: gradio
 
  hf_oauth_scopes:
  - inference-api
  license: apache-2.0
+ short_description: Lightweight RAG demo for clinical decision support
  ---
 
+ # FetchMerck AI Demo
+
+ A lightweight, public **demonstration** of a Retrieval-Augmented Generation (RAG)
+ pipeline for clinical decision support.
+
+ This Space uses:
+
+ - A small in-memory **sample corpus** of original, paraphrased clinical
+ reference snippets (no copyrighted source material).
+ - `sentence-transformers/all-MiniLM-L6-v2` for embeddings.
+ - Cosine-similarity retrieval over a NumPy matrix (no vector DB).
+ - A hosted generation model via the Hugging Face Inference API.
+
+ ## ⚠️ Medical Disclaimer
+
+ This Space is an **educational prototype only**. It is **not a medical device**
+ and must **not** be used for diagnosis, treatment, triage, or any clinical
+ decision-making. Outputs may be inaccurate or incomplete. Always consult a
+ licensed clinician for medical questions.
+
+ ## How it works
+
+ 1. The user enters a clinical question.
+ 2. The query is embedded and compared against the sample corpus by cosine similarity.
+ 3. The top-k passages are concatenated as grounded context.
+ 4. A hosted instruction-tuned LLM is asked to answer **only** from that context.
+ 5. The response is shown along with the source section names and a disclaimer.
+
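The diff does not include the retrieval code itself, so the following is only a minimal sketch of steps 2 and 3 of "How it works" (cosine similarity over a NumPy matrix, then concatenating the top-k passages). The function names `top_k_cosine` and `build_context`, the toy 3-dimensional vectors, and the placeholder passages are all hypothetical; in the Space, the vectors would presumably come from `sentence-transformers/all-MiniLM-L6-v2`.

```python
import numpy as np


def top_k_cosine(query_vec, corpus_matrix, k=3):
    """Return (indices, scores) of the k corpus rows most similar to query_vec.

    Hypothetical helper: the README only states that retrieval is
    cosine similarity over a NumPy matrix.
    """
    q = np.asarray(query_vec, dtype=np.float64)
    m = np.asarray(corpus_matrix, dtype=np.float64)
    # L2-normalise both sides so a dot product equals cosine similarity.
    q = q / (np.linalg.norm(q) + 1e-12)
    m = m / (np.linalg.norm(m, axis=1, keepdims=True) + 1e-12)
    scores = m @ q
    k = min(k, len(scores))
    order = np.argsort(-scores)[:k]  # highest similarity first
    return order.tolist(), scores[order].tolist()


def build_context(passages, indices):
    """Concatenate the selected passages, tagging each with its section name."""
    return "\n\n".join(
        f"[{passages[i]['section']}] {passages[i]['text']}" for i in indices
    )


# Toy stand-ins for the sample corpus and its embeddings (assumed shapes).
passages = [
    {"section": "Section A", "text": "placeholder passage A"},
    {"section": "Section B", "text": "placeholder passage B"},
    {"section": "Section C", "text": "placeholder passage C"},
]
corpus_vecs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]])
query_vec = [0.9, 0.1, 0.0]

indices, scores = top_k_cosine(query_vec, corpus_vecs, k=2)
context = build_context(passages, indices)
print(indices)  # [0, 2]
print(context)
```

The section tags in the context string are one way to support step 5, where source section names are shown next to the answer.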
+ ## Configuration
+
+ Optional environment variables / Space secrets:
+
+ - `HF_TOKEN` — Hugging Face token (needed only for gated or private generation models).
+ - `GEN_MODEL` — override the generation model (default: `meta-llama/Llama-3.1-8B-Instruct`).
+
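How the app reads these two variables is not shown in this commit. One plausible sketch, assuming they are read from the process environment with the documented default, is given below. The names `load_settings` and `DEFAULT_GEN_MODEL` are hypothetical.

```python
import os

# Default taken from the README text; the constant name is an assumption.
DEFAULT_GEN_MODEL = "meta-llama/Llama-3.1-8B-Instruct"


def load_settings(env=None):
    """Resolve optional settings, falling back to the README default.

    `env` is any mapping; it defaults to os.environ so the function
    can be exercised without touching real Space secrets.
    """
    env = os.environ if env is None else env
    return {
        # Unset or empty GEN_MODEL falls back to the default model.
        "gen_model": env.get("GEN_MODEL") or DEFAULT_GEN_MODEL,
        # HF_TOKEN is optional; None means "no explicit token".
        "hf_token": env.get("HF_TOKEN") or None,
    }


settings = load_settings({})
print(settings["gen_model"])  # meta-llama/Llama-3.1-8B-Instruct
```

The resulting values could then be handed to a client such as `huggingface_hub.InferenceClient(model=..., token=...)`, though whether the Space does so is not confirmed by the diff.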
+ ## Roadmap
+
+ This is the v0 publishable baseline. Planned upgrades, in order:
+
+ 1. Replace the sample corpus with a **legally publishable** medical reference
+ corpus (e.g., openly licensed clinical guidelines, public-domain references,
+ or content the project is licensed to redistribute).
+ 2. Move retrieval to a persistent vector store (e.g., Chroma) once the corpus grows.
+ 3. Pre-build and ship a vector index alongside the Space.
+ 4. Optionally add local GGUF inference on GPU hardware.
+
+ ## What this Space deliberately does **not** do
+
+ - It does **not** include or redistribute the Merck Manuals or any other
+ restricted, paywalled, or copyrighted clinical reference content.
+ - It does **not** persist user data; the in-memory index is rebuilt each cold start.
+ - It does **not** provide medical advice.