| # Deployment Guide: Snapdragon X ARM Laptop (16GB RAM) |
|
|
| ## Your Hardware |
| - Snapdragon X 10-core X1P64100 @ 3.40 GHz |
| - 16 GB RAM (ARM-based, no discrete GPU) |
| - Windows 11 ARM64 |
|
|
| ## What Works on Your Machine |
|
|
| ### The Full Application (No GPU Needed) |
| ```bash |
| git clone https://huggingface.co/nkshirsa/phd-research-os-brain |
| cd phd-research-os-brain |
| pip install gradio pymupdf |
| python -m phd_research_os_v2.app |
| ``` |
| This runs the entire UI: paper ingestion, section-aware parsing, claim extraction (heuristic mode), knowledge graph, conflict detection, scoring, Obsidian export. All on CPU. |
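|
| In heuristic mode, claim extraction works without any LLM. As a rough sketch of how such a rule could look (illustrative only; the unit list and regex are assumptions, not the repo's actual heuristics):
|
| ```python
| import re
|
| # Illustrative heuristic: keep sentences that contain a number with a unit.
| QUANT = re.compile(r"\b\d+(?:\.\d+)?\s*(?:fM|pM|nM|mM|%|nm|mV|dB)\b")
|
| def extract_claims(sentences):
|     return [s for s in sentences if QUANT.search(s)]
|
| print(extract_claims([
|     "The LOD was 0.8 fM.",             # quantitative -> kept as a claim
|     "Biosensors are widely studied.",  # narrative -> dropped
| ]))
| ```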
|
|
| ### Local LLM Inference via Ollama |
| Ollama ships a native Windows ARM64 build, so it runs on this machine without x64 emulation.
|
|
| ```powershell |
| # Install Ollama (Windows ARM64 supported) |
| winget install Ollama.Ollama |
| |
| # Pull a quantized model that fits in 16GB RAM |
| # Option 1: Qwen2.5 3B (Q4_K_M ≈ 2.5 GB RAM)
| ollama pull qwen2.5:3b |
| |
| # Option 2: Phi-3 Mini (Q4_K_M ≈ 2.5 GB RAM) - good at structured output
| ollama pull phi3:mini |
| |
| # Option 3: if you want to push it, Qwen2.5 7B (Q4_K_M ≈ 5 GB RAM)
| ollama pull qwen2.5:7b |
| |
| # Verify it works |
| ollama run qwen2.5:3b "Extract claims from: The LOD was 0.8 fM." |
| ``` |
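|
| The same sanity check can be run programmatically against Ollama's REST API (`/api/generate` is Ollama's standard generation endpoint):
|
| ```python
| import requests
|
| # One-shot (non-streaming) generation against the local Ollama server.
| resp = requests.post(
|     "http://localhost:11434/api/generate",
|     json={
|         "model": "qwen2.5:3b",
|         "prompt": "Extract claims from: The LOD was 0.8 fM.",
|         "stream": False,  # return a single JSON object, not a stream
|     },
|     timeout=120,
| )
| print(resp.json()["response"])
| ```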
|
|
| RAM budget with Ollama running: |
| ``` |
| Windows + apps: ~6 GB |
| Ollama + 3B model: ~3 GB |
| Research OS app: ~1 GB |
| ChromaDB: ~0.5 GB |
| ───────────────────────────
| Total: ~10.5 GB of 16 GB → comfortable
| ``` |
|
|
| A 7B model is tighter (~13 GB total) but workable if you close other apps.
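|
| The per-model figures above follow from a simple rule of thumb: Q4_K_M stores roughly 4.85 bits per weight, plus around a gigabyte of KV cache and runtime overhead (both numbers are approximations):
|
| ```python
| # Back-of-envelope RAM estimate for a Q4_K_M-quantized model.
| def q4_ram_gb(params_b, bits_per_weight=4.85, overhead_gb=1.0):
|     return params_b * bits_per_weight / 8 + overhead_gb  # params_b in billions
|
| print(f"3B: ~{q4_ram_gb(3):.1f} GB, 7B: ~{q4_ram_gb(7):.1f} GB")
| # -> 3B: ~2.8 GB, 7B: ~5.2 GB, consistent with the budget above
| ```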
|
|
| ### Connect Ollama to Research OS |
| Set this environment variable before launching: |
| ```powershell |
| $env:OLLAMA_BASE_URL = "http://localhost:11434" |
| ``` |
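|
| A quick way to confirm the app can reach the server at that URL (`/api/tags` is Ollama's model-listing endpoint):
|
| ```python
| import os, requests
|
| # Read the same env var the app uses and list the installed models.
| base = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")
| models = requests.get(f"{base}/api/tags", timeout=5).json()["models"]
| print("Ollama reachable; models:", [m["name"] for m in models])
| ```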
|
|
| Or use API fallback (cloud, no local model needed): |
| ```powershell |
| $env:ANTHROPIC_API_KEY = "sk-ant-..." |
| # OR |
| $env:OPENAI_API_KEY = "sk-..." |
| ``` |
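|
| The order below is a plausible sketch of how a backend might be chosen from these variables (the app's actual fallback logic may differ):
|
| ```python
| import os
|
| # Hypothetical backend selection: prefer local Ollama, then cloud keys,
| # finally the no-LLM heuristic mode mentioned above.
| def pick_backend():
|     if os.environ.get("OLLAMA_BASE_URL"):
|         return "ollama"
|     if os.environ.get("ANTHROPIC_API_KEY"):
|         return "anthropic"
|     if os.environ.get("OPENAI_API_KEY"):
|         return "openai"
|     return "heuristic"
|
| print(pick_backend())
| ```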
|
|
| ## What Requires Cloud |
|
|
| ### Model Training |
| Your machine cannot train the model locally (no discrete GPU). Use one of:
| 1. **ZeroGPU Space**: [nkshirsa/phd-research-os-train](https://huggingface.co/spaces/nkshirsa/phd-research-os-train); requires HF PRO ($9/month) for ZeroGPU access
| 2. **Google Colab**: Free T4 GPU. Upload `train.py` and run. |
| 3. **Any cloud GPU**: Lambda, Vast.ai, RunPod; run `python train.py` (a minimal outline is sketched after this list)
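|
| For orientation, here is a minimal outline of what such a LoRA fine-tune typically contains (illustrative only; the repo's `train.py` is authoritative, and the base model and Hub repo id below are assumptions):
|
| ```python
| from transformers import AutoModelForCausalLM, AutoTokenizer
| from peft import LoraConfig, get_peft_model
|
| base = "Qwen/Qwen2.5-3B-Instruct"  # assumed base; match your Ollama model
| model = AutoModelForCausalLM.from_pretrained(base)
| tokenizer = AutoTokenizer.from_pretrained(base)
|
| # Standard LoRA config: low-rank adapters on the attention projections.
| lora = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM",
|                   target_modules=["q_proj", "k_proj", "v_proj", "o_proj"])
| model = get_peft_model(model, lora)
| model.print_trainable_parameters()
|
| # ... training loop over claim-extraction pairs goes here ...
| # model.push_to_hub("nkshirsa/phd-research-os-adapter")  # assumed repo id
| ```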
|
|
| After training, the LoRA adapter is pushed to the Hub. Then pull it into Ollama:
| ```bash |
| # Convert the trained adapter to GGUF (e.g. with llama.cpp's
| # convert_lora_to_gguf.py, on a machine with a GPU), then register it:
| ollama create research-os-brain -f Modelfile
| ``` |
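|
| A minimal `Modelfile` for that `ollama create` step might look like this, assuming the adapter was converted to `adapter.gguf` and Qwen2.5 3B was the training base (both assumptions):
|
| ```
| # Base model and adapter filename are assumptions, not from the repo.
| FROM qwen2.5:3b
| ADAPTER ./adapter.gguf
| ```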
|
|
| ### Marker PDF Parser (Optional Upgrade) |
| Marker requires PyTorch, which is heavy on ARM without a GPU. The built-in PyMuPDF parser works fine for most papers. Install Marker only if you need better layout detection:
| ```bash |
| pip install marker-pdf # May be slow on ARM without GPU |
| ``` |
|
|
| ## Recommended Daily Workflow |
|
|
| 1. **Start Ollama** (runs in the background): `ollama serve` (a one-shot launcher is sketched after this list)
| 2. **Start Research OS**: `python -m phd_research_os_v2.app` |
| 3. **Open browser**: http://localhost:7860 |
| 4. **Phase 1**: Upload PDFs → system parses into sections
| 5. **Phase 2**: Extract claims (Ollama-powered or heuristic) |
| 6. **Phase 3**: Build knowledge graph |
| 7. **Phase 4**: Detect conflicts |
| 8. **Phase 5**: Rescore with code-computed confidence |
| 9. **Export**: Obsidian vault or CSV/JSON download |
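|
| Steps 1-2 can be combined into a one-shot launcher (hypothetical convenience script, not part of the repo; skip the `Popen` line if Ollama is already running as a service):
|
| ```python
| import os, subprocess, time
|
| os.environ["OLLAMA_BASE_URL"] = "http://localhost:11434"
| server = subprocess.Popen(["ollama", "serve"])  # step 1: background server
| time.sleep(3)                                   # let it bind port 11434
| try:
|     subprocess.run(["python", "-m", "phd_research_os_v2.app"])  # step 2
| finally:
|     server.terminate()                          # stop Ollama on exit
| ```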
|
|