Add ECC Harness: phd_research_os/ARCHITECTURE.md
phd_research_os/ARCHITECTURE.md
ADDED
@@ -0,0 +1,80 @@
# PhD Research OS: Architecture Map

> **WAKE-UP INSTRUCTION**: This file is the ground truth for all file locations,
> API configurations, and system topology. Read this FIRST before touching anything.

## System Topology

```
phd-research-os-brain/
├── ARCHITECTURE.md          ← YOU ARE HERE (project map)
├── AGENTS.md                ← Agent role registry & contracts
├── MEMORY.md                ← Persistent cross-session state
├── plan.md                  ← Current task plan (mutable)
├── HARNESS_EVOLUTION.md     ← ECC rule amendments log
│
├── train.py                 ← SFT training script (Qwen2.5-3B + QLoRA)
├── generate_dataset.py      ← Synthetic dataset generator (1900 examples)
│
├── phd_research_os/         ← CORE PACKAGE
│   ├── __init__.py          ← v1.0.0
│   ├── db.py                ← SQLite data layer (Phase 0)
│   │                          Tables: claims, sources, goals, conflicts,
│   │                          decisions, overrides, experiments,
│   │                          api_usage_log, calibration_log, embedding_cache
│   │                          + companion_agents, agent_tasks, agent_audit_log
│   ├── agents.py            ← Original 6-role AI brain (ResearchOSBrain)
│   ├── agent_os.py          ← ECC HARNESS ORCHESTRATOR (companion AI factory)
│   │                          CompanionAgent lifecycle: spawn → preflight →
│   │                          plan → execute → postflight → retire
│   ├── pipeline.py          ← Paper ingestion (PDF → claims)
│   ├── obsidian_export.py   ← Obsidian vault export (one-directional)
│   ├── evaluation.py        ← Golden dataset eval harness + regression gate
│   ├── conflict_detector.py ← Pairwise contradiction detection
│   └── backup.py            ← SQLite backup & restore
│
├── tests/
│   ├── test_db.py           ← 22 unit tests (data layer)
│   └── test_agent_os.py     ← ECC harness integration tests
│
└── output/
    └── research_os/
        └── db.py            ← Symlink/alias → phd_research_os/db.py
```

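The CompanionAgent lifecycle listed for `agent_os.py` (spawn → preflight → plan → execute → postflight → retire) can be read as a strictly ordered state machine. The sketch below is one possible way to enforce that order; `Stage`, `LifecycleTracker`, and `LifecycleError` are hypothetical names for illustration and are not taken from `agent_os.py`.

```python
from enum import Enum


class Stage(Enum):
    SPAWN = "spawn"
    PREFLIGHT = "preflight"
    PLAN = "plan"
    EXECUTE = "execute"
    POSTFLIGHT = "postflight"
    RETIRE = "retire"


# Stage order exactly as listed in the topology comment for agent_os.py.
ORDER = list(Stage)


class LifecycleError(RuntimeError):
    """Raised when a stage is skipped, repeated, or requested after retire."""


class LifecycleTracker:
    """Hypothetical helper: only the next stage in ORDER is accepted."""

    def __init__(self) -> None:
        self.stage = Stage.SPAWN
        self.history = [Stage.SPAWN]

    def advance(self, target: Stage) -> None:
        idx = ORDER.index(self.stage)
        # Reject anything that is not the immediate successor.
        if idx + 1 >= len(ORDER) or ORDER[idx + 1] is not target:
            raise LifecycleError(
                f"cannot go from {self.stage.value} to {target.value}"
            )
        self.stage = target
        self.history.append(target)
```

Under this assumption, a companion that tries to jump from preflight straight to execute raises `LifecycleError` and stays in preflight, so a skipped planning step shows up as an explicit failure.
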
## API Configuration

| Provider | Env Variable | Default Model | Use Case |
|----------|-------------|---------------|----------|
| Anthropic | `ANTHROPIC_API_KEY` | claude-sonnet-4-20250514 | Primary brain + companion agents |
| OpenAI | `OPENAI_API_KEY` | gpt-4o-mini | Fallback |
| OpenRouter | `OPENROUTER_API_KEY` | (configurable) | Multi-model companion agents |
| HF Local | (model path) | nkshirsa/phd-research-os-brain | Fine-tuned local inference |

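As an illustration of how the table might be consumed, the sketch below picks a provider from whichever API key is set. The priority order (Anthropic, then OpenAI, then OpenRouter, then the local model) is an assumption based on the "Primary" and "Fallback" labels, and `resolve_provider` is a hypothetical helper, not a function from the package.

```python
import os
from typing import Mapping, Optional, Tuple

# (provider, env variable, default model), in the table's row order.
PROVIDERS = [
    ("anthropic", "ANTHROPIC_API_KEY", "claude-sonnet-4-20250514"),
    ("openai", "OPENAI_API_KEY", "gpt-4o-mini"),
    ("openrouter", "OPENROUTER_API_KEY", None),  # model is configurable
]
HF_LOCAL_MODEL = "nkshirsa/phd-research-os-brain"


def resolve_provider(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, Optional[str]]:
    """Return (provider, default model) for the first provider with a key set.

    Assumption: with no API key present, fall back to the local HF model.
    """
    env = os.environ if env is None else env
    for name, var, model in PROVIDERS:
        if env.get(var):  # ignores unset and empty values
            return name, model
    return "hf_local", HF_LOCAL_MODEL
```

Passing a plain dict instead of reading `os.environ` directly keeps the lookup easy to test, for example `resolve_provider({"OPENAI_API_KEY": "..."})`.
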
## Database Schema (db.py)

### Core Tables (Research OS)
- `claims`: Scientific claims with fixed-point confidence (×1000)
- `sources`: Paper metadata (DOI, journal tier, study type)
- `goals`: Research goals with priority ordering
- `conflicts`: Claim contradiction pairs (hypothesis_confidence ALWAYS "low")
- `decisions`: Proposed research actions with info gain scores
- `overrides`: Expert confidence overrides (lock mechanism)
- `experiments`: Lab data objects (manual approval required)
- `api_usage_log`: Cost tracking per API call
- `calibration_log`: Brier score data collection
- `embedding_cache`: Semantic cache (text_hash → embedding)

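To make the fixed-point convention concrete, here is a minimal sketch of storing a confidence as an integer scaled by 1000 in SQLite. The table definition is a simplified stand-in (the real `claims` schema in `db.py` is not shown in this file), and `to_fixed`, `from_fixed`, and the `"1.0.0"` schema version value are hypothetical.

```python
import sqlite3

SCALE = 1000  # probabilities are stored as INTEGER x 1000


def to_fixed(p: float) -> int:
    """Convert a probability in [0, 1] to its fixed-point integer form."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"probability out of range: {p}")
    return round(p * SCALE)


def from_fixed(v: int) -> float:
    """Convert a stored fixed-point integer back to a float for display."""
    return v / SCALE


conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table; column names are assumptions.
conn.execute(
    "CREATE TABLE claims ("
    " id INTEGER PRIMARY KEY,"
    " text TEXT NOT NULL,"
    " confidence INTEGER NOT NULL CHECK (confidence BETWEEN 0 AND 1000),"
    " schema_version TEXT NOT NULL)"
)
conn.execute(
    "INSERT INTO claims (text, confidence, schema_version) VALUES (?, ?, ?)",
    ("Example claim", to_fixed(0.875), "1.0.0"),
)
stored = conn.execute("SELECT confidence FROM claims").fetchone()[0]
```

The `CHECK` constraint is one way to let the database itself reject out-of-range values; conversion back to a float would happen only at the presentation layer.
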
### ECC Harness Tables (agent_os.py)
- `companion_agents`: Registry of spawned companion AIs
- `agent_tasks`: Task lifecycle tracking (preflight → done)
- `agent_audit_log`: Every action, decision, and deviation logged

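As a rough illustration of the audit trail, the sketch below appends one row per event to an `agent_audit_log` table. The column layout and the `log_event` helper are assumptions made for this example; the actual schema lives in `db.py` and may differ.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
# Hypothetical column layout for illustration only.
conn.execute(
    "CREATE TABLE agent_audit_log ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " agent_id TEXT NOT NULL,"
    " event TEXT NOT NULL,"      # e.g. 'action', 'decision', 'deviation'
    " detail TEXT NOT NULL,"
    " created_at TEXT NOT NULL)"
)


def log_event(agent_id: str, event: str, detail: str) -> int:
    """Insert one audit row with a UTC timestamp and return its row id."""
    cur = conn.execute(
        "INSERT INTO agent_audit_log (agent_id, event, detail, created_at)"
        " VALUES (?, ?, ?, ?)",
        (agent_id, event, detail, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return cur.lastrowid
```

A call such as `log_event("companion-1", "deviation", "skipped optional step")` would then leave a timestamped record that can be reviewed after the agent retires.
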
## Key Invariants (NEVER violate)

1. **Fixed-Point Math**: All probabilities stored as INTEGER × 1000. No floats in DB.
2. **Provenance**: All AI output is Level 5 (LLM Hypothesis). Human required for promotion.
3. **Hypothesis Confidence**: Conflict hypotheses are ALWAYS "low". Never auto-promote.
4. **Expert Override**: Once set, system cannot overwrite. Only human can change.
5. **Schema Version**: Every record carries `schema_version` tag.
6. **Companion Agent Isolation**: Companions cannot modify claims directly; they propose, humans approve.
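
Invariants 4 and 6 can be pictured as a small propose-and-approve flow. The following is a minimal sketch under stated assumptions: `Claim`, `Proposal`, `ReviewQueue`, and `system_update` are hypothetical names, and the real enforcement in `db.py` and `agent_os.py` may look quite different.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Claim:
    claim_id: int
    confidence: int              # fixed-point, x 1000
    expert_locked: bool = False  # set once an expert override exists


@dataclass
class Proposal:
    claim_id: int
    new_confidence: int
    proposed_by: str


@dataclass
class ReviewQueue:
    pending: List[Proposal] = field(default_factory=list)

    def propose(self, claim: Claim, new_confidence: int, agent: str) -> Proposal:
        """Companions only enqueue a proposal; the claim is left untouched."""
        proposal = Proposal(claim.claim_id, new_confidence, agent)
        self.pending.append(proposal)
        return proposal

    def approve(self, claim: Claim, proposal: Proposal, reviewer_is_human: bool) -> None:
        """Apply a proposal to its claim; only a human reviewer may do so."""
        if not reviewer_is_human:
            raise PermissionError("only a human can approve a proposal")
        if proposal.claim_id != claim.claim_id:
            raise ValueError("proposal does not belong to this claim")
        claim.confidence = proposal.new_confidence
        self.pending.remove(proposal)


def system_update(claim: Claim, new_confidence: int) -> bool:
    """Automated update: refused once an expert override is set."""
    if claim.expert_locked:
        return False
    claim.confidence = new_confidence
    return True
```

In this sketch a companion's suggestion never changes `claim.confidence` until `approve` is called by a human reviewer, and `system_update` becomes a no-op once `expert_locked` is true.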