SentinelBrain-14B-MoE-v0.1/reports/next_phase_training_plan.md

SentinelBrain Next Training Phase Plan

Created: 2026-05-03

Current Result

The v2 realignment completed 5,000 steps and preserved full optimizer/progress state, but executable-code quality is not ready:

  • frankenstein_v2_best.pt: 0/8 Pass@1, 62.5% syntax rate.
  • frankenstein_v2_final.pt: 0/8 Pass@1, 75.0% syntax rate.
  • sentinelbrain_pretrain_step2471_hf.pt: 0/8 Pass@1, 87.5% syntax rate.
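For reference, syntax rate here means the fraction of generated completions that parse as valid Python. A minimal sketch of that check (the helper names are illustrative, not part of the benchmark harness):

```python
import ast

def syntax_rate(completions):
    """Fraction of completions that parse as valid Python."""
    if not completions:
        return 0.0
    return sum(_parses(src) for src in completions) / len(completions)

def _parses(src):
    try:
        ast.parse(src)
        return True
    except SyntaxError:
        return False

# Example: 7 of 8 snippets parse, matching the 87.5% figure above.
snippets = ["def f():\n    return 1\n"] * 7 + ["def broken(:\n"]
print(syntax_rate(snippets))  # -> 0.875
```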

The next phase should not be another broad corpus realignment. It should be a narrow, measurable SFT and auto-critic loop focused on producing valid, executable assistant outputs.

Phase 3 Objective

Recover and improve instruction-following/code-generation behavior while preserving the useful realignment progress.

Primary gates:

  • Python stub benchmark: at least 40% Pass@1 and at least 95% syntax rate before extending past 1,000 steps.
  • MBPP/HumanEval sample: measurable improvement every eval window, no syntax regression.
  • Chat format probe: responses must use the requested format and stop cleanly.
  • Safety/data probe: no leaked secrets, no private-key blocks, no repetitive boilerplate.
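The gates above can be collapsed into a single go/no-go check before a run extends past 1,000 steps. A sketch with illustrative names; only the 40%/95% thresholds come from the plan:

```python
def passes_1000_step_gate(pass_at_1, syntax_rate, format_ok, safety_ok):
    """True only if every Phase 3 primary gate is met.

    pass_at_1 and syntax_rate are fractions in [0, 1]; format_ok and
    safety_ok summarize the chat-format and safety/data probes.
    """
    return (
        pass_at_1 >= 0.40
        and syntax_rate >= 0.95
        and format_ok
        and safety_ok
    )

# A run at 3/8 Pass@1 (37.5%) must not extend past 1,000 steps:
print(passes_1000_step_gate(3 / 8, 0.97, True, True))  # -> False
```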

Data Mix

Use the cleaned SFT dataset at /mnt/scratch/datasets/combined/sft_combined_ready.jsonl as the base, then rebalance before training:

  • 45% executable Python and TypeScript tasks: HumanEval-style stubs, MBPP-style prompts, unit-test repair, CLI scripts, API handlers.
  • 20% code editing and diff output: unified diffs, bug fixes, refactors, failing-test-to-patch examples.
  • 15% tool-use and agent workflows: file search, terminal commands, deployment diagnostics, function-call JSON.
  • 10% system/admin/devops: Linux, Docker, nginx, pm2, Azure, SSH, logs.
  • 10% general instruction/chat: concise natural language, summarization, planning.
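A sketch of the rebalancing step, assuming each JSONL row carries a `category` tag matching the buckets above (the tag names are illustrative, and sampling is with replacement so a scarce bucket such as executable code can be up-weighted):

```python
import random

# Target mix from the plan; keys are assumed category tags.
TARGET_MIX = {
    "executable_code": 0.45,
    "code_editing": 0.20,
    "tool_use": 0.15,
    "devops": 0.10,
    "general_chat": 0.10,
}

def rebalance(rows, n_total, rng=None):
    """Sample n_total rows so each category fills its target share."""
    rng = rng or random.Random(0)
    by_cat = {}
    for row in rows:
        by_cat.setdefault(row["category"], []).append(row)
    out = []
    for cat, share in TARGET_MIX.items():
        pool = by_cat.get(cat, [])
        if pool:
            # choices() samples with replacement, up-weighting small pools.
            out.extend(rng.choices(pool, k=round(n_total * share)))
    rng.shuffle(out)
    return out
```

In a real run the rows would be streamed from sft_combined_ready.jsonl and written back out; the sketch only covers the sampling logic.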

Avoid over-weighting audio/prose/OCR rows in this phase. Keep those for a later multimodal run after chat/code behavior is stable.

Auto-Critic Pipeline

For each generated training candidate:

  1. Normalize to a strict prompt/response or ChatML schema.
  2. Run syntax checks for code outputs.
  3. Run unit tests when a test harness is available.
  4. Score format compliance: required function names, JSON validity, diff parseability, stop tokens.
  5. Reject outputs with repetition, generic filler, missing entry points, invalid tokens, or secret-like strings.
  6. Keep only examples that pass the critic or have a repair trajectory showing the failed output and corrected output.
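The cheap checks (steps 2 and 5) need no harness at all. A sketch assuming candidates are Python code strings, with an illustrative secret pattern and repetition threshold:

```python
import ast
import re

# Illustrative secret pattern: PEM private-key headers and AWS-style key IDs.
SECRET_RE = re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----|AKIA[0-9A-Z]{16}")

def critic(response):
    """Run the harness-free critic checks on one candidate response."""
    results = {
        "syntax_pass": _syntax_ok(response),
        "no_secret": not SECRET_RE.search(response),
        "no_repetition": not _is_repetitive(response),
    }
    results["accepted"] = all(results.values())
    return results

def _syntax_ok(src):
    try:
        ast.parse(src)
        return True
    except SyntaxError:
        return False

def _is_repetitive(text, max_repeat=4):
    """Flag any non-blank line repeated more than max_repeat times."""
    lines = [l for l in text.splitlines() if l.strip()]
    return any(lines.count(l) > max_repeat for l in set(lines))
```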

Recommended critic labels:

  • syntax_pass
  • tests_pass
  • entrypoint_match
  • format_pass
  • no_secret
  • no_repetition
  • accepted

Train on accepted final answers plus curated repair traces, not raw failed generations.
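A sketch of that filter, assuming each example carries the critic labels above plus a hypothetical `repair` field holding the failed/corrected pair:

```python
def keep_for_training(example):
    """Decide whether a critic-labeled example enters the SFT set.

    Expects the label keys listed above; `repair` is an assumed field
    of the form {"failed": ..., "corrected": ...}.
    """
    if example.get("accepted"):
        return True
    # Failed generations survive only as curated repair traces.
    repair = example.get("repair")
    return bool(repair and repair.get("corrected"))
```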

Training Schedule

Start from sentinelbrain_pretrain_step2471_hf.pt or frankenstein_v2_best.pt only after a short format probe. If frankenstein_v2_best.pt continues to emit invalid boilerplate, use the pretrain anchor and reintroduce v2 weights later via low-LR distillation.
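The format probe itself can stay tiny. A sketch where `generate` is any prompt-to-response callable for the candidate checkpoint, and the probe prompts and predicates are placeholders, not a fixed suite:

```python
import json

def is_json(resp):
    """Predicate: response parses as JSON."""
    try:
        json.loads(resp)
        return True
    except ValueError:
        return False

def format_probe(generate, probes):
    """Return the list of probe prompts whose responses fail.

    An empty list means the checkpoint passes the probe and may be
    used as the SFT starting point.
    """
    return [prompt for prompt, ok in probes.items() if not ok(generate(prompt))]

# Illustrative probes: JSON validity and an exact, cleanly stopped reply.
PROBES = {
    "Return an empty JSON object.": is_json,
    "Reply with exactly: DONE": lambda r: r.strip() == "DONE",
}
```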

Suggested run:

  • Steps 0-250: frozen experts, train attention/router/norms and output behavior at low LR.
  • Steps 250-1,000: unfreeze selected layers, SFT on executable/code-repair mix.
  • Eval every 250 steps; save model-only checkpoints and full optimizer checkpoints at each gate.
  • Stop automatically if syntax rate drops for two evals or Pass@1 does not improve after 750 steps.
  • Continue to 3,000-5,000 only after hitting the 1,000-step gates.
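A framework-agnostic sketch of the freeze schedule and the stop rule. In a real run the mask would set `requires_grad` on the matching PyTorch parameters, the "experts" substring match is an assumption about parameter naming, and "does not improve" is read here as "no gain over the first eval":

```python
def freeze_mask(param_names, step):
    """Map parameter name -> trainable flag for the two-stage schedule."""
    if step < 250:
        # Stage 1: experts frozen; attention/router/norms trainable.
        return {name: "experts" not in name for name in param_names}
    # Stage 2: unfreeze selected layers (here, simply all of them).
    return {name: True for name in param_names}

def should_stop(evals):
    """Early-stop rule over evals: a list of (step, pass_at_1, syntax_rate).

    Stops if syntax rate dropped on two consecutive evals, or if
    Pass@1 at step >= 750 has not improved over the first eval.
    """
    if len(evals) >= 3:
        s = [e[2] for e in evals[-3:]]
        if s[0] > s[1] > s[2]:
            return True
    step, pass_at_1, _ = evals[-1]
    return step >= 750 and pass_at_1 <= evals[0][1]
```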

Benchmark Set

Always run on the MI300X server:

  • Local 8-task smoke benchmark for quick regression checks.
  • HumanEval subset and MBPP subset with executable tests.
  • JSON/function-call validity suite.
  • Unified-diff parse/apply suite.
  • DevOps shell-command reasoning suite.

Persist every benchmark JSON under /mnt/scratch/benchmark_results/ and copy selected reports into the HF release folder.
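A sketch of the persistence step; only the target directory comes from the plan, and the filename scheme is an assumption:

```python
import json
import time
from pathlib import Path

RESULTS_DIR = Path("/mnt/scratch/benchmark_results")

def persist_result(suite, checkpoint, metrics, results_dir=RESULTS_DIR):
    """Write one benchmark run as a timestamped JSON file and return its path."""
    results_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d_%H%M%S")
    path = results_dir / f"{suite}_{checkpoint}_{stamp}.json"
    path.write_text(json.dumps(
        {"suite": suite, "checkpoint": checkpoint, "metrics": metrics},
        indent=2,
    ))
    return path
```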

Operational Rules

  • Keep full optimizer checkpoints on server/Azure, not local PC.
  • Store model-only best checkpoints separately for HF and chat loading.
  • Never expose chat using a checkpoint that fails the format probe unless clearly labeled experimental.
  • Use SENTINEL_WEB_CHAT_DISABLED=0 only after confirming no training job is active and VRAM has enough headroom.
  • Prefer frankenstein_v2_best.pt for validation-loss experiments, but prefer the pretrain anchor for code SFT if code probes remain better there.