Speed up CPU inference: halve token limits, pre-download models, fix OMP threads 4af4003 cgoodmaker Claude Opus 4.6 committed on Mar 2
Use bfloat16 on CPU to halve memory (8GB vs 16GB float32) 0989643 cgoodmaker Claude Opus 4.6 committed on Feb 26
Fix MCP subprocess deadlock: use stderr=None instead of PIPE da343a7 cgoodmaker Claude Opus 4.6 committed on Feb 23
Add timeout and stderr logging to MCP subprocess to debug tool hangs c376e14 cgoodmaker Claude Opus 4.6 committed on Feb 23
Remove unused files: old Gradio frontend, dead model code, test artifacts 672ed11 cgoodmaker Claude Opus 4.6 committed on Feb 23
Force MCP tool models to CPU to avoid GPU VRAM contention with MedGemma 1a97904 cgoodmaker Claude Opus 4.6 committed on Feb 23
Add missing deps: sentence-transformers, pdfplumber, scipy for MCP tools 5157ba3 cgoodmaker Claude Opus 4.6 committed on Feb 23
Add RAG Phase 4 management guidance, rebuild guidelines index (286 chunks), post-analysis hint UI 5241b71 cgoodmaker Claude Opus 4.6 committed on Feb 23
Use dtype instead of deprecated torch_dtype in model_kwargs 82f82ac cgoodmaker Claude Opus 4.6 committed on Feb 23
Redesign chat UI and fix MedGemma generation config issues 58a4476 cgoodmaker Claude Opus 4.6 committed on Feb 23
Fix requirements: opencv-headless (no libGL needed), remove unused gradio, add faiss-cpu and huggingface_hub 4cbee96 cgoodmaker committed on Feb 21
Include mcp_server/ in Docker image — required at runtime for tool calls 2068e4f cgoodmaker committed on Feb 21
Fix Dockerfile: install curl before using it for NodeSource setup 076039f cgoodmaker committed on Feb 21
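
Note on da343a7 (stderr=None instead of PIPE): the repository code is not shown in this log, so the following is only a minimal sketch of the general failure mode, not the project's actual MCP launcher. It assumes a hypothetical child process (CHILD) that writes more to stderr than an OS pipe buffer typically holds, and a hypothetical helper (first_stdout_line) that reads only stdout. With stderr=subprocess.PIPE and nobody draining that pipe, the child blocks on its stderr write while the parent blocks on stdout, which is the classic deadlock. With stderr=None the child inherits the parent's stderr and cannot block that way.

```python
import subprocess
import sys

# Hypothetical child for illustration: it writes a large amount of data to
# stderr (more than a typical OS pipe buffer), then prints one stdout line.
CHILD = (
    "import sys\n"
    "sys.stderr.write('x' * 200_000)\n"
    "sys.stderr.flush()\n"
    "print('done')\n"
)


def first_stdout_line(stderr=None):
    """Start the child and return the first line it prints on stdout.

    stderr=None (the default) lets the child inherit the parent's stderr,
    so its stderr writes never wait on this process. Passing
    subprocess.PIPE here without reading proc.stderr can hang: the child
    stalls once the pipe buffer fills, and readline() below never returns.
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=stderr,
        text=True,
    )
    line = proc.stdout.readline().strip()
    proc.stdout.close()
    proc.wait(timeout=30)  # guard against a child that never exits
    return line
```

Calling first_stdout_line() with the default forwards the child's stderr to the parent's terminal or log; passing subprocess.DEVNULL discards it instead. Both avoid the undrained pipe.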