Text Generation
llama-cpp-python
GGUF
English
rag
healthcare
clinical-decision-support
medical
merck-manual
retrieval-augmented-generation
mistral
Instructions to use jeremygracey-ai/FetchMerck_AI with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use jeremygracey-ai/FetchMerck_AI with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="jeremygracey-ai/FetchMerck_AI", filename="mistral-7b-instruct-v0.1.Q4_K_M.gguf", )
output = llm( "Once upon a time,", max_tokens=512, echo=True ) print(output)
- llama-cpp-python
How to use jeremygracey-ai/FetchMerck_AI with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="jeremygracey-ai/FetchMerck_AI", filename="mistral-7b-instruct-v0.1.Q4_K_M.gguf", )
output = llm( "Once upon a time,", max_tokens=512, echo=True ) print(output)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use jeremygracey-ai/FetchMerck_AI with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf jeremygracey-ai/FetchMerck_AI:Q4_K_M # Run inference directly in the terminal: llama-cli -hf jeremygracey-ai/FetchMerck_AI:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf jeremygracey-ai/FetchMerck_AI:Q4_K_M # Run inference directly in the terminal: llama-cli -hf jeremygracey-ai/FetchMerck_AI:Q4_K_M
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf jeremygracey-ai/FetchMerck_AI:Q4_K_M # Run inference directly in the terminal: ./llama-cli -hf jeremygracey-ai/FetchMerck_AI:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf jeremygracey-ai/FetchMerck_AI:Q4_K_M # Run inference directly in the terminal: ./build/bin/llama-cli -hf jeremygracey-ai/FetchMerck_AI:Q4_K_M
Use Docker
docker model run hf.co/jeremygracey-ai/FetchMerck_AI:Q4_K_M
- LM Studio
- Jan
- vLLM
How to use jeremygracey-ai/FetchMerck_AI with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "jeremygracey-ai/FetchMerck_AI" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "jeremygracey-ai/FetchMerck_AI", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/jeremygracey-ai/FetchMerck_AI:Q4_K_M
- Ollama
How to use jeremygracey-ai/FetchMerck_AI with Ollama:
ollama run hf.co/jeremygracey-ai/FetchMerck_AI:Q4_K_M
- Unsloth Studio new
How to use jeremygracey-ai/FetchMerck_AI with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for jeremygracey-ai/FetchMerck_AI to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for jeremygracey-ai/FetchMerck_AI to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for jeremygracey-ai/FetchMerck_AI to start chatting
- Docker Model Runner
How to use jeremygracey-ai/FetchMerck_AI with Docker Model Runner:
docker model run hf.co/jeremygracey-ai/FetchMerck_AI:Q4_K_M
- Lemonade
How to use jeremygracey-ai/FetchMerck_AI with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull jeremygracey-ai/FetchMerck_AI:Q4_K_M
Run and chat with the model
lemonade run user.FetchMerck_AI-Q4_K_M
List all available models
lemonade list
Update README.md
Browse files
README.md
CHANGED
|
@@ -17,80 +17,9 @@ pipeline_tag: text-generation
|
|
| 17 |
datasets:
|
| 18 |
- custom
|
| 19 |
library_name: llama-cpp-python
|
|
|
|
| 20 |
|
| 21 |
# FetchMerck_AI
|
| 22 |
|
| 23 |
**A RAG-based clinical decision support system powered by the Merck Manuals**
|
| 24 |
|
| 25 |
-
FetchMerck_AI is a Retrieval-Augmented Generation (RAG) solution designed to help healthcare providers streamline clinical decision-making by surfacing relevant medical knowledge from the Merck Manuals in real time. The system retrieves contextually relevant passages from over 4,000 pages of medical reference content spanning 23 clinical sections, then generates grounded, citation-backed responses using a quantized Mistral-7B model.
|
| 26 |
-
|
| 27 |
-
## Key Objectives
|
| 28 |
-
|
| 29 |
-
- **Streamline clinical decision-making** β Surface relevant diagnostic and treatment information at the point of care
|
| 30 |
-
- **Analyze impact on diagnostics and patient outcomes** β Evaluate how RAG-assisted retrieval affects clinical reasoning quality
|
| 31 |
-
- **Standardize care practices** β Leverage a trusted, evidence-based reference to reduce variation in clinical decisions
|
| 32 |
-
- **Demonstrate feasibility** β Provide a functional prototype showing real-world applicability of RAG in healthcare settings
|
| 33 |
-
|
| 34 |
-
## Architecture
|
| 35 |
-
|
| 36 |
-
| Component | Details |
|
| 37 |
-
|-----------|---------|
|
| 38 |
-
| **LLM** | Mistral-7B-v0.1 (GGUF quantized) |
|
| 39 |
-
| **Retrieval** | RAG pipeline over vectorized Merck Manual content |
|
| 40 |
-
| **Knowledge Base** | Merck Manuals β 4,000+ page PDF covering 23 medical sections (disorders, diagnostics, drugs, tests) |
|
| 41 |
-
| **Framework** | LangChain + llama-cpp-python |
|
| 42 |
-
|
| 43 |
-
### How It Works
|
| 44 |
-
|
| 45 |
-
1. **Document Ingestion** β The Merck Manual PDF is chunked and embedded into a vector store
|
| 46 |
-
2. **Query Processing** β A provider's clinical question is embedded and matched against the knowledge base
|
| 47 |
-
3. **Contextual Retrieval** β The most relevant passages are retrieved with source attribution
|
| 48 |
-
4. **Grounded Generation** β Mistral-7B generates a response grounded in the retrieved evidence, reducing hallucination risk
|
| 49 |
-
|
| 50 |
-
## About the Merck Manuals
|
| 51 |
-
|
| 52 |
-
The [Merck Manuals](https://www.merckmanuals.com/) are medical reference books published by the American pharmaceutical company Merck & Co. since 1899. They cover a comprehensive range of medical topics including disorders, tests, diagnoses, and drugs across 23 clinical sections. The manuals are widely regarded as one of the most trusted general medical references available.
|
| 53 |
-
|
| 54 |
-
## Intended Use
|
| 55 |
-
|
| 56 |
-
- **Primary users:** Healthcare providers, clinical researchers, medical educators
|
| 57 |
-
- **Use case:** Point-of-care decision support, clinical education, care standardization research
|
| 58 |
-
- **Setting:** Research and prototyping β not intended for production clinical deployment without further validation
|
| 59 |
-
|
| 60 |
-
## Limitations
|
| 61 |
-
|
| 62 |
-
- This is a **research prototype** demonstrating RAG feasibility in healthcare; it has not been validated for clinical production use
|
| 63 |
-
- Responses are grounded in the Merck Manual content and may not reflect the latest clinical guidelines or institution-specific protocols
|
| 64 |
-
- The system should augment β never replace β clinical judgment
|
| 65 |
-
- Performance depends on retrieval quality; edge cases or highly specialized queries may yield suboptimal results
|
| 66 |
-
|
| 67 |
-
## Ethical Considerations
|
| 68 |
-
|
| 69 |
-
- **Patient safety:** This tool is designed as a decision *support* system, not an autonomous diagnostic agent
|
| 70 |
-
- **Bias:** The knowledge base reflects the scope and perspective of the Merck Manuals; providers should cross-reference with additional sources for complex cases
|
| 71 |
-
- **Privacy:** The system processes queries only β no patient data is stored or transmitted
|
| 72 |
-
|
| 73 |
-
## Citation
|
| 74 |
-
|
| 75 |
-
If you use FetchMerck_AI in your research, please cite:
|
| 76 |
-
|
| 77 |
-
```bibtex
|
| 78 |
-
@misc{gracey2026fetchmerck,
|
| 79 |
-
title={FetchMerck_AI: RAG-Based Clinical Decision Support Using the Merck Manuals},
|
| 80 |
-
author={Gracey, Jeremy},
|
| 81 |
-
year={2026},
|
| 82 |
-
publisher={Hugging Face},
|
| 83 |
-
doi={10.57967/hf/8101},
|
| 84 |
-
url={https://huggingface.co/jeremygracey-ai/FetchMerck_AI}
|
| 85 |
-
}
|
| 86 |
-
```
|
| 87 |
-
|
| 88 |
-
## Author
|
| 89 |
-
|
| 90 |
-
**Jeremy Gracey** β Clinical healthcare professional (8+ years) transitioning into healthcare AI/ML. Currently completing an AI/ML certificate at UT Austin McCombs School of Business.
|
| 91 |
-
|
| 92 |
-
- Hugging Face: [@jeremygracey-ai](https://huggingface.co/jeremygracey-ai)
|
| 93 |
-
- Background: Anesthesia Technician & Psychiatric Technician β AI/ML Engineer
|
| 94 |
-
- Focus: Building AI systems that bridge the gap between clinical frontline experience and modern ML infrastructure
|
| 95 |
-
|
| 96 |
-
|
|
|
|
| 17 |
datasets:
|
| 18 |
- custom
|
| 19 |
library_name: llama-cpp-python
|
| 20 |
+
---
|
| 21 |
|
| 22 |
# FetchMerck_AI
|
| 23 |
|
| 24 |
**A RAG-based clinical decision support system powered by the Merck Manuals**
|
| 25 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|