# Deployment Guide: Snapdragon X ARM Laptop (16 GB RAM)

## Your Hardware

- Snapdragon X 10-core X1P64100 @ 3.40 GHz
- 16 GB RAM (ARM-based, no discrete GPU)
- Windows 11 ARM64

## What Works on Your Machine

### The Full Application (No GPU Needed)

```bash
git clone https://huggingface.co/nkshirsa/phd-research-os-brain
cd phd-research-os-brain
pip install gradio pymupdf
python -m phd_research_os_v2.app
```

This runs the entire UI: paper ingestion, section-aware parsing, claim extraction (heuristic mode), knowledge graph, conflict detection, scoring, Obsidian export. All on CPU.

### Local LLM Inference via Ollama

Ollama runs natively on Windows ARM64 (since v0.1.29).

```powershell
# Install Ollama (Windows ARM64 supported)
winget install Ollama.Ollama

# Pull a quantized model that fits in 16 GB RAM
# Option 1: Qwen2.5 3B (Q4_K_M ≈ 2.5 GB RAM)
ollama pull qwen2.5:3b

# Option 2: Phi-3 Mini (Q4_K_M ≈ 2.5 GB RAM) - good at structured output
ollama pull phi3:mini

# Option 3: If you want to push it — Qwen2.5 7B (Q4_K_M ≈ 5 GB RAM)
ollama pull qwen2.5:7b

# Verify it works
ollama run qwen2.5:3b "Extract claims from: The LOD was 0.8 fM."
```

RAM budget with Ollama running:

```
Windows + apps:      ~6 GB
Ollama + 3B model:   ~3 GB
Research OS app:     ~1 GB
ChromaDB:            ~0.5 GB
──────────────────────────────
Total:               ~10.5 GB of 16 GB — comfortable
```

A 7B model is tighter (~13 GB total) but workable if you close other apps.

### Connect Ollama to Research OS

Set this environment variable before launching:

```powershell
$env:OLLAMA_BASE_URL = "http://localhost:11434"
```

(A quick way to verify the endpoint is sketched at the end of this guide.)

Or use the API fallback (cloud, no local model needed):

```powershell
$env:ANTHROPIC_API_KEY = "sk-ant-..."
# OR
$env:OPENAI_API_KEY = "sk-..."
```

## What Requires Cloud

### Model Training

Your machine cannot train the model. Use one of:

1. **ZeroGPU Space**: [nkshirsa/phd-research-os-train](https://huggingface.co/spaces/nkshirsa/phd-research-os-train) — requires HF PRO ($9/month) for ZeroGPU access
2. **Google Colab**: Free T4 GPU. Upload `train.py` and run.
3. **Any cloud GPU**: Lambda, Vast.ai, RunPod — run `python train.py`

After training, the LoRA adapter is pushed to the Hub. Then pull it into Ollama:

```bash
# Convert trained adapter to GGUF (on a machine with GPU)
# Then:
ollama create research-os-brain -f Modelfile
```

A minimal example `Modelfile` is sketched at the end of this guide.

### Marker PDF Parser (Optional Upgrade)

Marker requires PyTorch, which is heavy on ARM. The built-in PyMuPDF parser works fine for most papers. Install Marker only if you need better layout detection:

```bash
pip install marker-pdf  # May be slow on ARM without GPU
```

## Recommended Daily Workflow

1. **Start Ollama** (runs in background): `ollama serve`
2. **Start Research OS**: `python -m phd_research_os_v2.app`
3. **Open browser**: http://localhost:7860
4. **Phase 1**: Upload PDFs → system parses into sections
5. **Phase 2**: Extract claims (Ollama-powered or heuristic)
6. **Phase 3**: Build knowledge graph
7. **Phase 4**: Detect conflicts
8. **Phase 5**: Rescore with code-computed confidence
9. **Export**: Obsidian vault or CSV/JSON download
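Steps 1–3 can be scripted. A minimal PowerShell launcher sketch, assuming `ollama` and `python` are on your `PATH` and the app reads `OLLAMA_BASE_URL` as described above (the script name and the 5-second wait are just suggestions):

```powershell
# start-research-os.ps1: a sketch of a one-shot launcher; adjust to your setup
# Step 1: start Ollama in the background (skip if it is already running)
Start-Process ollama -ArgumentList "serve" -WindowStyle Hidden

# Point the app at the local Ollama endpoint
$env:OLLAMA_BASE_URL = "http://localhost:11434"

# Step 2: launch Research OS (Gradio serves on port 7860 by default)
Start-Process python -ArgumentList "-m", "phd_research_os_v2.app"

# Step 3: give the server a moment to come up, then open the UI
Start-Sleep -Seconds 5
Start-Process "http://localhost:7860"
```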
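If the app can't reach Ollama, you can test the endpoint directly from PowerShell. `/api/generate` is part of Ollama's REST API; the model and prompt below just mirror the verification step from the install section:

```powershell
# POST a one-off, non-streaming generation request to the local Ollama server
$body = @{
    model  = "qwen2.5:3b"
    prompt = "Extract claims from: The LOD was 0.8 fM."
    stream = $false
} | ConvertTo-Json

Invoke-RestMethod -Uri "http://localhost:11434/api/generate" `
    -Method Post -Body $body -ContentType "application/json"
```

The `response` field of the JSON reply holds the model output; any connection error here means the problem is Ollama, not Research OS.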
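For reference, the `Modelfile` mentioned under Model Training can be very small. A sketch, assuming the converted adapter was saved as `research-os-brain-adapter.gguf` (a hypothetical filename); `FROM`, `ADAPTER`, and `PARAMETER` are standard Modelfile directives:

```
# Modelfile: a minimal sketch; the adapter filename is hypothetical
# The FROM base must match the model the LoRA adapter was trained on
FROM qwen2.5:3b
ADAPTER ./research-os-brain-adapter.gguf
PARAMETER temperature 0.2
```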