Add v0.9.4 update section: six differentiation bets
README sync for the v0.9.4 release (https://github.com/joemunene-by/GhostLM/releases/tag/v0.9.4). Adds the six-bets table near the top with three measured results (bet 3 settled at +4%, bet 5 MoE smoke PASS, bet 6 baseline 0/8 with eval harness). Updates citation note to v0.9.4. Refreshes roadmap to describe how the bets compose on top of ghost-base when GPU lands. v0.9 chat checkpoint itself is unchanged; bench numbers are intact.
README.md
CHANGED
|
@@ -76,6 +76,29 @@ parameters the model has the *register* of cybersec writing but not the
|
|
| 76 |
This repo holds the slim inference checkpoint
|
| 77 |
(`best_model.pt`, 324 MB, model + config only, optimizer state stripped).
|
| 78 |
|
| 79 |
## Bench numbers
|
| 80 |
|
| 81 |
All benches run with debiased multi-permutation text-scoring on
|
|
@@ -224,7 +247,7 @@ output rather than reliable answers.
|
|
| 224 |
author = {Munene, Joe},
|
| 225 |
year = {2026},
|
| 226 |
howpublished = {\url{https://github.com/joemunene-by/GhostLM}},
|
| 227 |
-
note = {v0.9.
|
| 228 |
}
|
| 229 |
```
|
| 230 |
|
|
@@ -242,6 +265,14 @@ The fact-recall bar is the truth metric. Spec at
|
|
| 242 |
multi-year pathway through ghost-7B in
|
| 243 |
[`docs/hardware_pathway.md`](https://github.com/joemunene-by/GhostLM/blob/main/docs/hardware_pathway.md).
|
| 244 |
|
| 245 |
## License
|
| 246 |
|
| 247 |
Apache 2.0. Same license as the GhostLM source code.
|
|
|
|
| 76 |
This repo holds the slim inference checkpoint
|
| 77 |
(`best_model.pt`, 324 MB, model + config only, optimizer state stripped).
|
| 78 |
|
| 79 |
+
## v0.9.4 update (2026-05-08): six differentiation bets
|
| 80 |
+
|
| 81 |
+
The v0.9.4 release adds six concrete bets to make GhostLM not-another-
|
| 82 |
+
point-on-the-small-cybersec-LM-plot. Each has a runnable scaffold in
|
| 83 |
+
the GitHub repo. Three are already measured. Strategic frame:
|
| 84 |
+
[`docs/differentiation.md`](https://github.com/joemunene-by/GhostLM/blob/main/docs/differentiation.md).
|
| 85 |
+
|
| 86 |
+
| Bet | Status | Result |
|
| 87 |
+
|---|---|---|
|
| 88 |
+
| 1. Tool-grounded SFT | scaffolded | scripts/distill_tool_use.py, ~$200 budget |
|
| 89 |
+
| 2. Daily LoRA over fresh threat-intel | scaffolded | scripts/daily_finetune.py, ~1-2 GPU hr/day |
|
| 90 |
+
| 3. Custom 32K BPE | **measured + settled** | +4.0% on cyber, -2.5% on general vs GPT-2 BPE; +25-35% projection falsified, see [bpe_corpus_ablation.md](https://github.com/joemunene-by/GhostLM/blob/main/docs/bpe_corpus_ablation.md) |
|
| 91 |
+
| 4. Long context via RoPE NTK | scaffolded | scripts/extend_context_ntk.py, ~3-5 GPU hr |
|
| 92 |
+
| 5. MoE for ghost-1B+ | **smoke validated** | 100-step training PASS, see [moe_training_smoke.md](https://github.com/joemunene-by/GhostLM/blob/main/docs/moe_training_smoke.md); presets `ghost-1b` (2.1B/1.2B-active) and `ghost-3b` (6.0B/3.3B-active) |
|
| 93 |
+
| 6. Format-aware pretrain (STIX/YARA/Sigma/MISP) | **end-to-end measurable** | v0.9 baseline locked at 0/8 = 0%, see [format_baseline_v09.md](https://github.com/joemunene-by/GhostLM/blob/main/docs/format_baseline_v09.md); 560 templated training records ready, see [format_synth.md](https://github.com/joemunene-by/GhostLM/blob/main/docs/format_synth.md) |
|
| 94 |
+
|
| 95 |
+
The strategic claim isn't that any one bet definitely works; it's
|
| 96 |
+
that the **combination** of six reasonable bets gives GhostLM a
|
| 97 |
+
defensible identity that parameter-scale-only roadmaps don't.
|
| 98 |
+
|
| 99 |
+
The v0.9 chat checkpoint in this repo is unchanged; it's the
|
| 100 |
+
baseline against which the bet measurements are made.
|
| 101 |
+
|
| 102 |
## Bench numbers
|
| 103 |
|
| 104 |
All benches run with debiased multi-permutation text-scoring on
|
|
|
|
| 247 |
author = {Munene, Joe},
|
| 248 |
year = {2026},
|
| 249 |
howpublished = {\url{https://github.com/joemunene-by/GhostLM}},
|
| 250 |
+
note = {v0.9.4 release; 81M-parameter chat checkpoint plus six differentiation bets}
|
| 251 |
}
|
| 252 |
```
|
| 253 |
|
|
|
|
| 265 |
multi-year pathway through ghost-7B in
|
| 266 |
[`docs/hardware_pathway.md`](https://github.com/joemunene-by/GhostLM/blob/main/docs/hardware_pathway.md).
|
| 267 |
|
| 268 |
+
After ghost-base lands, the v0.9.4 differentiation bets compose on
|
| 269 |
+
top of it: tool-use SFT (bet 1) on the fresh ghost-base, format-aware
|
| 270 |
+
pretrain mix (bet 6) using the 560 templated records plus
|
| 271 |
+
LLM-distilled traces, RoPE NTK context extension to 16K (bet 4), and
|
| 272 |
+
eventually ghost-1B with native MoE from step 0 (bet 5). Sequencing
|
| 273 |
+
detail in
|
| 274 |
+
[`docs/differentiation.md`](https://github.com/joemunene-by/GhostLM/blob/main/docs/differentiation.md).
|
| 275 |
+
|
| 276 |
## License
|
| 277 |
|
| 278 |
Apache 2.0. Same license as the GhostLM source code.
|
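
The roadmap hunk above mentions RoPE NTK context extension to 16K (bet 4) without showing the arithmetic. The sketch below illustrates the commonly used NTK-aware base adjustment, `base * scale ** (d / (d - 2))`; it is not taken from `scripts/extend_context_ntk.py`. The original context length (2048), head dimension (64), and RoPE base (10000) are hypothetical placeholders, since the diff does not state GhostLM's actual values.

```python
import math

# Hypothetical values for illustration only; the diff does not state
# GhostLM's real context length, head dimension, or RoPE base.
ORIG_CTX = 2048
TARGET_CTX = 16384   # "16K" from the roadmap text
HEAD_DIM = 64
BASE = 10000.0


def ntk_scaled_base(base: float, head_dim: int, scale: float) -> float:
    """Return the NTK-aware RoPE base: base * scale ** (d / (d - 2))."""
    if head_dim <= 2 or head_dim % 2 != 0:
        raise ValueError("head_dim must be an even integer greater than 2")
    return base * scale ** (head_dim / (head_dim - 2))


def rope_inv_freq(base: float, head_dim: int) -> list[float]:
    """Inverse rotary frequencies, one per pair of head dimensions."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


scale = TARGET_CTX / ORIG_CTX                 # context extension factor
new_base = ntk_scaled_base(BASE, HEAD_DIM, scale)

orig = rope_inv_freq(BASE, HEAD_DIM)
scaled = rope_inv_freq(new_base, HEAD_DIM)

# The fastest-rotating pair is left untouched, while the slowest pair is
# slowed down by the full extension factor.
print(f"scale factor:     {scale:g}")
print(f"adjusted base:    {new_base:.1f}")
print(f"highest freq:     {orig[0]:.6f} -> {scaled[0]:.6f}")
print(f"lowest freq ratio: {orig[-1] / scaled[-1]:.3f}")
```

The design point is that the adjustment stretches only the low-frequency dimensions: the highest frequency stays at 1.0 and the lowest is divided by exactly the extension factor, which is why this form is usually preferred over uniformly interpolating positions. Whether GhostLM applies it with or without further fine-tuning is left to the linked sequencing doc.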