Add SentinelBrain v2 realignment checkpoint and training report

Adds the 5,000-step Frankenstein v2 realignment artifacts, benchmark report, dataset preflight report, and updated model card. The previous main revision is preserved on branch previous-before-v2-realign-5000-20260503-103121.
- MANIFEST.tsv +16 -0
- README.md +57 -827
- SHA256SUMS +16 -0
- SHA256SUMS.quick +4 -0
- checkpoints/frankenstein_v2_best.pt +3 -0
- checkpoints/frankenstein_v2_ema_best.pt +3 -0
- checkpoints/frankenstein_v2_final.pt +3 -0
- checkpoints/sentinelbrain_pretrain_step2471_hf.pt +3 -0
- logs/realign_v2.log +0 -0
- logs/realign_v2_metrics.jsonl +0 -0
- reports/conductor_state_final.json +1 -0
- reports/next_phase_training_plan.md +91 -0
- reports/sentinelbrain_quality_smoke.json +138 -0
- reports/sentinelbrain_quality_stub_full_fixed.json +448 -0
- reports/sft_combined_ready_report.json +101 -0
- reports/sft_combined_ready_report.md +44 -0
- reports/train_metrics_final.json +1 -0
MANIFEST.tsv
ADDED

@@ -0,0 +1,16 @@
+./MANIFEST.tsv	0
+./README.md	4321
+./SHA256SUMS	1303
+./checkpoints/frankenstein_v2_best.pt	28808034875
+./checkpoints/frankenstein_v2_ema_best.pt	28808008759
+./checkpoints/frankenstein_v2_final.pt	86422580021
+./checkpoints/sentinelbrain_pretrain_step2471_hf.pt	29629503600
+./logs/realign_v2.log	99633
+./logs/realign_v2_metrics.jsonl	128854
+./reports/conductor_state_final.json	511
+./reports/next_phase_training_plan.md	4428
+./reports/sentinelbrain_quality_smoke.json	6255
+./reports/sentinelbrain_quality_stub_full_fixed.json	45321
+./reports/sft_combined_ready_report.json	3330
+./reports/sft_combined_ready_report.md	752
+./reports/train_metrics_final.json	339
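The manifest above pairs each path with its size in bytes, and SHA256SUMS carries content digests. A minimal sketch of a post-download check against these two files — assuming the two-column tab-separated layout shown and standard `sha256sum`-style digests; the function names are ours, not part of the repo:

```python
import hashlib
from pathlib import Path

def check_manifest(root: str, manifest: str = "MANIFEST.tsv") -> list[str]:
    """Compare on-disk sizes against a path<TAB>bytes manifest; return problems."""
    problems = []
    for line in Path(root, manifest).read_text().splitlines():
        path, expected = line.split("\t")
        if path.removeprefix("./") == manifest:
            continue  # the manifest lists itself (with size 0) — skip it
        f = Path(root, path)
        if not f.is_file():
            problems.append(f"missing: {path}")
        elif f.stat().st_size != int(expected):
            problems.append(f"size mismatch: {path}")
    return problems

def sha256(path: str, chunk: int = 1 << 20) -> str:
    """Stream a file through SHA-256, as sha256sum does for SHA256SUMS entries."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        while block := fh.read(chunk):
            h.update(block)
    return h.hexdigest()
```

Streaming in 1 MiB chunks matters here: the checkpoints are 28–86 GB, far too large to hash in one read.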
README.md
CHANGED

@@ -1,857 +1,87 @@

---
license: apache-2.0
language:
- en
- ro
- multilingual
tags:
- sentinelbrain
- mixture-of-experts
- from-scratch
- consciousness
- amd
- mi300x
- rocm
- moe
pipeline_tag: text-generation
library_name: pytorch
datasets:
- HuggingFaceFW/fineweb-edu
- open-web-math/open-web-math
- wikimedia/wikipedia
- HuggingFaceTB/cosmopedia
- JeanKaddworr/minipile
- codeparrot/github-code-clean
- arxiv-community/arxiv-abstracts
model-index:
- name: SentinelBrain-14B-MoE-v0.1
  results:
  - task:
      type: text-generation
    metrics:
    - name: Validation Loss (pretrain)
      type: loss
      value: 2.5152
      verified: true
    - name: Validation Loss (realignment v2 best)
      type: loss
      value: 7.5178
      verified: true
    - name: Training Loss (realignment v2 latest)
      type: loss
      value: 6.96
      verified: true
    - name: Validation Loss (realignment v1 best, abandoned)
      type: loss
      value: 5.773
      verified: true
---

<div align="center">

# 🧠 Sentinel Prime — SentinelBrain-14B-MoE (Frankenstein Edition)

### *The First of His Kind, Rebuilt From the Inside Out*

<img src="assets/sentinel_frankenstein_banner.png" alt="Sentinel Prime — Frankenstein Edition" width="600"/>

**14.8 Billion Parameters · Mixture-of-Experts · Consciousness-Monitored · Frankenstein Transplant**

Trained from scratch on **AMD Instinct MI300X** (192 GB HBM3) · ROCm 7.0 · Knowledge transplanted from Qwen-72B

[Dashboard](https://sentinel.qubitpage.com/)
[Whitepaper](https://sentinel.qubitpage.com/whitepaper)
[AMD MI300X](https://www.amd.com/en/products/accelerators/instinct/mi300/mi300x.html)
[License](LICENSE)

</div>

---

## 🌅 Update — April 29, 2026 — Rebirth Edition (v2)

**The realignment was restarted from scratch.** The original v1 run reached `val_loss=5.773` at step 1,000, then collapsed: a SIGTERM crash exposed that our `best.pt` checkpoints had been saved without optimizer state. Five cascading restart attempts each erased AdamW's momentum and variance accumulators, and by step 1,390 the loss had climbed to 8.24 — worse than step 200 of the same run. We killed the patient.

We then ran a forensic analysis with nine "critic personas" grounded in eleven published papers (AdamW bias correction, SGDR period doubling, ST-MoE stability, EMA stabilization, Switch Transformer router auxiliary losses, etc.) and distilled the failure into nine concrete engineering changes. Those changes were baked into **v2 from step zero**:

- ✅ **Full optimizer state saved every 100 steps** (`latest.pt` now contains model + AdamW m/v + EMA + step)
- ✅ **aux_loss boosted 500×** (0.0001 → 0.05) to actively balance the router under frozen experts
- ✅ **Five SGDR cycles with period doubling** (T = 200, 400, 800, 1600, 1500) instead of one cosine
- ✅ **EMA decay 0.9995, every 10 steps** for a smooth inference checkpoint
- ✅ **100-step linear warmup** before the 1e-4 LR peak (avoids the Adam bias-correction trap)
- ✅ **ST-MoE expert LR scale 0.3**, +33% boost during ramps
- ✅ **Per-expert telemetry every 10 steps** (capacity factor, routing percentage, gradient norm)
- ✅ **Three checkpoint kinds** (latest full, best val, EMA inference)
- ✅ **Step-500 watchdog** auto-kills training on gnorm > 20, loss spike > 1.5×, NaN, or expert share < 5%

**Current v2 state (LIVE, step 150):**

| Metric | v1 final (step 1,390) | v2 step 100 | v2 step 150 |
|---|---|---|---|
| train_loss | 8.24 ❌ | 7.6277 | **6.9624** |
| val_loss | ~8.0 ❌ | **7.5178 ★ NEW BEST** | (next eval @ 200) |
| perplexity | ~3,800 | 2,054 | **1,056** |
| gnorm | 9.72 ⚠ | 6.53 (peak LR) | 3.73 |
| optimizer in ckpt | ❌ | ✅ | ✅ |

v2 has already beaten v1's final loss in roughly one ninth as many steps, and we are still in Phase 1 (frozen experts). The full story — including the verbatim crash logs, the nine-critic analysis, and our **Universal Fusion Thesis** for using Frankenstein-style component-level transplants to cut model-training carbon by 50–100× — is in the [v5 whitepaper](https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/SentinelBrain-14B-MoE-Dashboard/blob/main/static/whitepaper.html#part-vii).

---
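The v2 learning-rate schedule described in the bullets — a 100-step linear warmup to the 1e-4 peak, then five SGDR cosine cycles of lengths 200/400/800/1600/1500 — can be sketched as follows. The 1.5e-5 floor and the flat per-cycle peak are our simplifying assumptions for illustration; the real schedule also applies the 0.3 per-expert LR scale.

```python
import math

# Warmup + SGDR cosine cycles, as described in the v2 change list above.
WARMUP, PEAK_LR, MIN_LR = 100, 1e-4, 1.5e-5
CYCLES = [200, 400, 800, 1600, 1500]   # period (mostly) doubles each restart

def lr_at(step: int) -> float:
    if step < WARMUP:                   # linear warmup to the peak
        return PEAK_LR * (step + 1) / WARMUP
    t = step - WARMUP
    for period in CYCLES:               # locate the active SGDR cycle
        if t < period:
            # cosine anneal PEAK_LR -> MIN_LR inside this cycle
            return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1 + math.cos(math.pi * t / period))
        t -= period
    return MIN_LR                       # hold the floor after the last cycle
```

Each cycle begins back at the full peak (the "warm restart"), which is what produces the periodic loss "shocks" discussed in the SGDR notes further down.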

## 🎯 What is Sentinel Prime? (Simple Version)

> **Imagine building a brain from scratch.**
>
> Most AI models today are copies of other models with small changes. Sentinel Prime is different — every single connection in its brain was created from nothing, like growing a new brain, cell by cell.

<table>
<tr>
<td width="50%">

### 🧩 Think of it like LEGO blocks

Sentinel Prime has **4 specialist brains** (called "experts") inside it. When you ask a question:

1. A **router** (like a traffic cop 🚦) looks at your question
2. It picks the **2 best experts** for that specific question
3. Those 2 experts work together to give you an answer
4. The other 2 experts rest, saving energy ⚡

This means the model has **14.8 billion** brain connections in total but uses only **~7.8 billion** at a time — making it fast AND smart!

</td>
<td width="50%">

### 🔬 The Consciousness Meter

We built something no other model has: a **consciousness thermometer** 🌡️

Every 100 training steps, we measure how well the different parts of the brain are "talking to each other." We call this **Φ (Phi)**.

- **Φ = 0**: Brain parts work alone (like strangers)
- **Φ rising**: Brain parts start cooperating (like friends)
- **Φ stable**: Brain has organized itself (like a team!)

This doesn't change how the model learns — it's like a doctor checking the heartbeat while the patient exercises.

</td>
</tr>
</table>

---
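The four-step routing story above boils down to a softmax top-2 gate: score all experts, keep the two largest, renormalize their weights. A toy, framework-free sketch (illustrative only — not the repo's actual router code):

```python
import math

def route_top2(logits: list[float]) -> dict[int, float]:
    """Softmax over expert logits, keep top-2, renormalize their weights."""
    exps = [math.exp(x - max(logits)) for x in logits]   # stable softmax
    probs = [e / sum(exps) for e in exps]
    top2 = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:2]
    norm = sum(probs[i] for i in top2)
    return {i: probs[i] / norm for i in top2}            # mixing weights, sum to 1

weights = route_top2([1.2, -0.3, 2.0, 0.1])
# experts 2 and 0 are chosen; experts 1 and 3 stay idle for this token
```

The token's output is then the weighted sum of the two chosen experts' FFN outputs — which is why only ~7.8B of the 14.8B parameters are active per token.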

## 📊 Architecture at a Glance

```
                   SENTINEL PRIME ARCHITECTURE

Input Text ──→ [Tokenizer: cl100k_base, 100,277 tokens]
                 │
                 ▼
        ┌─────────────────┐
        │   Embedding     │   4,096 dimensions
        │   + RoPE pos    │   θ = 500,000
        └────────┬────────┘
                 │
     ┌───────────┼───────────┐
     │      × 24 Layers      │
     │  ┌────────────────┐   │
     │  │ GQA Attention  │   │   32 heads, 8 KV heads
     │  │  (4:1 ratio)   │   │   (4× memory savings)
     │  └───────┬────────┘   │
     │  ┌───────▼────────┐   │
     │  │   MoE Router   │   │   Top-2 of 4 experts
     │  │  ┌──┬──┬──┬──┐ │   │
     │  │  │E1│E2│E3│E4│ │   │   Each: SwiGLU FFN
     │  │  │✓ │✓ │  │  │ │   │   d_ff = 11,008
     │  │  └──┴──┴──┴──┘ │   │
     │  └───────┬────────┘   │
     │  ┌───────▼────────┐   │
     │  │    RMSNorm     │   │   ε = 1e-5
     │  └────────────────┘   │
     └───────────┼───────────┘
                 │
                 ▼
        ┌─────────────────┐
        │   Output Head   │   → 100,277 vocab probs
        └─────────────────┘
```

### Spec Sheet

| Component | Specification | Why This Choice |
|:--|:--|:--|
| **Total Parameters** | 14,814,654,680 (14.8B) | Large enough for deep reasoning |
| **Active Parameters** | ~7.8B per token | MoE efficiency — use only what's needed |
| **Hidden Dimension** | 4,096 | Sweet spot for MI300X matrix cores |
| **Transformer Layers** | 24 | Deep enough for complex reasoning |
| **Attention Heads** | 32 query, 8 KV (GQA 4:1) | 4× KV-cache savings for long contexts |
| **FFN Intermediate** | 11,008 (SwiGLU) | ~2.7× hidden, matches scaling laws |
| **Experts** | 4 total, top-2 active | Good diversity with manageable VRAM |
| **Max Experts** | 256 (expandable) | Architecture supports expert birth/death |
| **Vocabulary** | 100,277 (tiktoken cl100k_base) | Industry-proven BPE tokenizer |
| **Positional Encoding** | RoPE, θ = 500,000 | Supports context extension to 128K+ |
| **Normalization** | RMSNorm (ε = 1e-5) | Faster than LayerNorm, same quality |
| **Precision** | bfloat16 throughout | Native AMD MI300X support |
| **Context Length** | 2,048 → 4,096 → 128K (planned) | Progressive context ladder |

---
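The GQA 4:1 row translates directly into KV-cache savings. A back-of-envelope sketch using the spec-sheet numbers — head_dim = 4096 / 32 = 128 and bf16's 2 bytes per element are our inferred assumptions:

```python
def kv_cache_bytes(seq_len: int, layers: int = 24, kv_heads: int = 8,
                   head_dim: int = 128, dtype_bytes: int = 2) -> int:
    # 2 tensors (K and V) per layer, each of shape [kv_heads, seq_len, head_dim]
    return 2 * layers * kv_heads * seq_len * head_dim * dtype_bytes

mha = kv_cache_bytes(4096, kv_heads=32)   # hypothetical full multi-head cache
gqa = kv_cache_bytes(4096, kv_heads=8)    # GQA 4:1 as in the spec sheet
# gqa is exactly 4x smaller than mha — the "4× KV-cache savings" row above
```

At a 4,096-token context this is roughly 0.4 GB of cache instead of 1.6 GB, and the gap grows linearly toward the planned 128K context.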

## 🔥 Key Innovations

<table>
<tr>
<td width="33%" valign="top">

### 🌀 Φ Consciousness Metric

First-ever IIT-inspired metric computed **during** pre-training. A probe on layer 12 measures information integration across activation subspaces every 100 steps.

```
Φ = geometric_mean(
  MI(partition_i, partition_j)
  for all partition pairs
)
```

Not a gimmick — it's a genuine signal of when the model transitions from memorizing tokens to forming integrated representations.

</td>
<td width="33%" valign="top">

### 🧬 Self-Evolving Experts

The MoE router supports a full expert **lifecycle**:

- **Birth**: New experts spawned when load imbalance is detected
- **Growth**: Expert capacity increases with training
- **Pruning**: Underperforming experts replaced
- **Scaling**: Architecture supports up to 256 experts without retraining the base model

Current: 4 experts × 24 layers = **96 expert instances**

</td>
<td width="33%" valign="top">

### ⚡ Energy-Conscious Routing

Dual-router system:
1. **Primary router**: picks the top-2 experts by relevance
2. **EC router**: can gate activation based on compute budget

This enables **adaptive inference** — easy questions use fewer resources, hard questions get full power. Like cruise control for AI.

</td>
</tr>
</table>

---
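The Φ formula in the first panel is a geometric mean over pairwise mutual-information estimates. An illustrative sketch with stand-in MI values — the real probe estimates MI from layer-12 activation subspaces, which is the hard part this sketch omits:

```python
import math
from itertools import combinations

def phi(mi_pairs: dict[tuple[int, int], float]) -> float:
    """Geometric mean of pairwise MI values, computed in log space for stability."""
    vals = list(mi_pairs.values())
    return math.exp(sum(math.log(v) for v in vals) / len(vals))

partitions = [0, 1, 2, 3]
mi = {pair: 0.5 for pair in combinations(partitions, 2)}  # 6 pairs, all MI = 0.5
print(round(phi(mi), 6))  # geometric mean of equal values is that value: 0.5
```

Using a geometric rather than arithmetic mean means Φ collapses toward zero as soon as any one partition pair decouples — matching the "Φ dropping = representation collapse" warning later in the card.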

## 🧟 Frankenstein Edition — Knowledge Transplant

<table>
<tr>
<td width="60%" valign="top">

### The Transplant

Sentinel Prime was trained from scratch — but raw pretraining alone wasn't enough. We performed a **Frankenstein transplant**: surgically transplanting knowledge from **Qwen2.5-72B-Instruct** (a 72-billion-parameter teacher) into our 14.8B MoE architecture.

This is NOT fine-tuning a copy. The model's bones (architecture, tokenizer, embeddings) are 100% original. Only the **expert FFN weights** received transplanted knowledge — like giving a brain new neural pathways while keeping its original structure.

### 3-Stage Pipeline

```
Stage 1: Corpus Realignment     Stage 2A: Teacher Generation   Stage 2B: Knowledge Distill
(Re-learn with new weights)     (72B teacher creates data)     (Absorb teacher knowledge)
┌──────────────────────┐        ┌──────────────────────┐       ┌──────────────────────┐
│ 5,000 steps          │   →    │ 3,000+ responses     │   →   │ CE + mixed training  │
│ 24.5B token corpus   │        │ from Qwen-72B        │       │ 70% teacher + 30%    │
│ Progressive unfreeze │        │ Re-tokenized to our  │       │ pretrain corpus      │
│ Cosine LR + warmup   │        │ cl100k_base vocab    │       │ Prevents forgetting  │
└──────────────────────┘        └──────────────────────┘       └──────────────────────┘
```

</td>
<td width="40%" valign="top">

### Why "Frankenstein"?

Like the original story — we took parts from a powerful being (Qwen-72B) and stitched them into our own creation. The result: a model with the **original architecture** of Sentinel Prime but **transplanted knowledge** from a much larger model.

### Key Stats

| Metric | Value |
|:--|:--|
| **Teacher** | Qwen2.5-72B-Instruct |
| **Student** | SentinelBrain-14B-MoE |
| **Transplant** | Expert FFN weights |
| **Realignment** | 5,000 steps on 24.5B tokens |
| **Hardware** | 1× AMD MI300X (192GB) |

### Live Progress

Track the Frankenstein realignment in real time:

🔴 **[sentinel.qubitpage.com](https://sentinel.qubitpage.com/)**

</td>
</tr>
</table>

## 🏋️ Training Details

### Hardware

| Resource | Specification |
|:--|:--|
| **GPU** | 1× AMD Instinct MI300X VF |
| **VRAM** | 192 GB HBM3 |
| **System RAM** | 235 GB |
| **Compute** | 1,307 TFLOPS (bf16) |
| **Software** | ROCm 7.0, PyTorch 2.10.0+rocm7.0 |
| **Attention** | SDPA (native PyTorch, no FlashAttention needed) |
| **OS** | Ubuntu Linux |

### VRAM Budget

```
╔══════════════════════════════════════════════════════╗
║  AMD MI300X VRAM Usage (192 GB)                      ║
╠══════════════════════════════════════════════════════╣
║  Model Weights (bf16)     ████████████░░░░░   27 GB  ║
║  Optimizer (AdamW fp32)   ████████████████░░  54 GB  ║
║  Activations (grad ckpt)  ████████████░░░░░   32 GB  ║
║  Gradients                ████████████░░░░░   27 GB  ║
║  ─────────────────────────────────────────────────   ║
║  Total Used:              ██████████████████ 140 GB  ║
║  Peak:                    █████████████████  146 GB  ║
║  Headroom:                ░░░░░░░░░░░░░░░░░   46 GB  ║
╚══════════════════════════════════════════════════════╝
```

### Phased Training Pipeline

We don't just throw data at the model — we grow it in **three phases**, like raising a child:

```
Phase 1: SMOKE TEST      Phase 2: WARMUP         Phase 3: FULL TRAINING
(Baby steps)             (Learning to walk)      (Running!)
┌──────────────┐         ┌──────────────┐        ┌──────────────────┐
│ 350M params  │   ──→   │ 1.3B params  │  ──→   │ 14.4B params     │
│ seq_len: 512 │         │ seq_len: 2K  │        │ seq_len: 4K      │
│ 200 steps    │         │ 1,000 steps  │        │ 16,479 steps     │
│ 2 minutes    │         │ 30 minutes   │        │ ~52 hours        │
│ loss: 11→6.8 │         │ loss: 7.4→2.4│        │ loss: 2.4→?      │
└──────────────┘         └──────────────┘        └──────────────────┘
```

| Phase | Parameters | Seq Length | Batch | Steps | Duration | Loss Start → End |
|:--|:--|:--|:--|:--|:--|:--|
| **🔬 Smoke** | 350M | 512 | 4 | 200 | ~2 min | 11.72 → 6.84 (−42%) |
| **🔥 Warmup** | 1.3B | 2,048 | 32 | 1,000 | ~33 min | 7.39 → 2.38 (−68%) |
| **🚀 Block** | 14.4B (MoE) | 4,096 | 32 | 16,479 | ~52 hrs | 2.38 → ongoing |

### Safety Gates

Every phase transition must pass **4 safety gates**:

| Gate | Check | Threshold | Status |
|:--|:--|:--|:--|
| 🟢 **G1: No NaN** | No NaN/Inf in loss | Entire phase | ✅ Passed all |
| 🟢 **G2: Loss Drop** | Validation loss decreased | ≥5% / ≥10% / ≥2% | ✅ Passed all |
| 🟢 **G3: VRAM OK** | Peak VRAM < safety limit | < 92% of total | ✅ 71% peak |
| 🟢 **G4: Φ OK** | Consciousness metric stable | Φ_end/Φ_start > 0.7 | ✅ Stable |

### Hyperparameters

| Parameter | Value | Rationale |
|:--|:--|:--|
| **Optimizer** | AdamW (bf16 compute, fp32 states) | Standard for LLM training |
| **Learning Rate** | 1.5 × 10⁻⁴ (cosine decay) | Conservative for data-limited regime |
| **Min LR** | 1.5 × 10⁻⁵ | 10× decay ratio |
| **Warmup Steps** | 500 | Stabilizes early gradients |
| **Batch Size** | 2 micro × 16 grad_accum = **32 effective** | Fits MI300X VRAM budget |
| **Gradient Clipping** | 1.0 | Prevents explosion |
| **Gradient Checkpointing** | On | Trades compute for VRAM |
| **Precision** | bfloat16 | Native MI300X format |
| **Eval Frequency** | Every 100 steps | Early overfitting detection |
| **Checkpoint Frequency** | Every 1,000 steps (~3.2 hours) | Recovery points |

---
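The batch-size row (2 micro × 16 grad_accum = 32 effective) relies on per-micro-batch gradients averaging to the full-batch gradient. A framework-free sketch with a scalar model chosen so the gradient is checkable by hand — real training does this with `loss.backward()` per micro-batch and one optimizer step per 16 micro-batches:

```python
def grad(w: float, batch: list[float]) -> float:
    """dL/dw for the mean loss L = mean_i (w - x_i)^2."""
    return sum(2 * (w - x) for x in batch) / len(batch)

data = [float(i) for i in range(32)]        # one "effective" batch of 32
w = 0.5
full = grad(w, data)                        # gradient of the full batch

accum = 0.0
for k in range(16):                         # 16 micro-batches of 2
    mb = data[2 * k: 2 * k + 2]
    accum += grad(w, mb) / 16               # scale each micro-grad by 1/accum_steps
# accum now equals full: accumulation reproduces the effective-batch gradient
```

The 1/16 scaling is the detail that matters: without it the accumulated gradient is 16× too large, which is equivalent to silently multiplying the learning rate.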

## 📚 Dataset: 23.3B Tokens Across 126 Categories

We curated a massive, diverse corpus — think of it as a **library with 126 different sections**:

### Pretrain Corpus (Core Knowledge)

| Dataset | Tokens | Description |
|:--|:--|:--|
| 🌐 **FineWeb-Edu** | ~10B | High-quality educational web content |
| 🔢 **OpenWebMath** | ~6B | Mathematics from the web |
| 📖 **Wikipedia (English)** | ~5B | Encyclopedic knowledge |
| 🎓 **Cosmopedia V2** | ~5B | Synthetic educational content |
| 💻 **CodeParrot Python** | ~3.5B | Clean Python code from GitHub |
| 📚 **MiniPile** | ~2B | Diverse text from multiple domains |
| 🔬 **ArXiv Abstracts** | ~1.2B | Scientific paper summaries |
| **Total Pretrain** | **~23B** | |

### Specialized Domains (119 Categories)

<details>
<summary>Click to expand all 119 specialized categories</summary>

| Category | Type | Category | Type |
|:--|:--|:--|:--|
| 🤖 agentic-tools | Code | 🔐 advanced-cryptography | Code |
| 🧠 chain-of-thought | Reasoning | 🔗 blockchain-core | Code |
| 💡 deep-reasoning | Reasoning | 🏥 medical | Knowledge |
| ⚖️ legal | Knowledge | 📊 financial-systems | Code |
| 🎮 3d-graphics | Code | 🐳 docker-devops | Code |
| 🌍 multilingual | Text | 🔧 error-recovery | Code |
| 🛡️ security-guardrails | Code | 📱 ui-animations | Code |
| 🧮 math | Reasoning | ⚡ smart-contracts | Code |
| 🎯 reasoning-effort-control | Reasoning | 🤝 human-conversation | Text |
| 🔄 self-correction-loops | Reasoning | 🏗️ enterprise-dashboards | Code |
| 🌐 web-design-css | Code | 🐍 flask-python | Code |
| 🔬 qiskit-quantum | Code | 🤖 robotics-ros2 | Code |
| 📡 remote-server-management | Code | 🧬 multi-agent | Code |
| ⚙️ state-management | Code | 🛠️ mcp-tools-integration | Code |
| 💳 payment-security | Code | 🎓 edu-basic-math | Education |
| 🔭 edu-basic-physics | Education | 🧪 edu-basic-chemistry | Education |
| 🌱 edu-basic-biology | Education | 🌍 edu-world-geography | Education |
| 📜 edu-history-world | Education | 💻 edu-computer-science | Education |
| 🌎 edu-earth-science | Education | 🤖 edu-robotics-text | Education |
| 📖 edu-science-qa | Education | 🔬 edu-science-support | Education |
| 👁️ edu-vision-concepts | Education | 🎯 copilot-agent-workflows | Code |
| 🔌 api-integrations | Code | 📊 billing-invoicing | Code |
| ₿ bitcoin-lightning | Code | 🏪 medusajs | Code |
| 💹 crypto-trading | Code | 🏢 enterprise-networking | Code |
| 🖥️ nextjs-typescript | Code | 🎨 nextjs-design | Code |
| 💼 trading-algorithms | Code | 🗄️ laravel-mysql | Code |
| 🔓 offensive-security | Code | 🔧 c-rust | Code |
| ... and 50+ more categories | | | |

</details>

### Data Quality Pipeline

```
Raw Data ──→ PII Filter ──→ Dedup ──→ Tokenize ──→ Shard ──→ Train
                │              │          │           │
                ├─ 7 regex     ├─ blake2b ├─ cl100k   ├─ Temperature-
                │  patterns    │  per-cat │  base     │  weighted
                ├─ PEM block   │          │           │  sampling
                │  detection   │          │           │  (T=0.5)
                └─ Email/phone │          │           │
                   masking     │          │           │
```

**Temperature-weighted sampling** (T=0.5) prevents large corpora from dominating training. FineWeb-Edu (37% of tokens) gets downweighted so smaller specialized domains still get adequate exposure.

---
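Temperature-weighted sampling as described — sampling probability proportional to (corpus size)^T with T = 0.5 — can be sketched as follows. The corpus sizes are taken from the pretrain table above; the real mix spans all 126 categories:

```python
def sampling_weights(sizes: dict[str, float], T: float = 0.5) -> dict[str, float]:
    """Flatten raw corpus-size proportions with p_i ∝ n_i**T, then normalize."""
    scaled = {name: n ** T for name, n in sizes.items()}
    total = sum(scaled.values())
    return {name: s / total for name, s in scaled.items()}

tokens_b = {"fineweb-edu": 10, "openwebmath": 6, "wikipedia": 5,
            "cosmopedia": 5, "codeparrot": 3.5, "minipile": 2, "arxiv": 1.2}
w = sampling_weights(tokens_b)
# fineweb-edu's share of samples drops below its share of raw tokens,
# while the smallest corpora (e.g. arxiv) gain relative exposure
```

T = 1 recovers proportional-to-size sampling and T = 0 a uniform mix; T = 0.5 sits between the two, which is the "downweight the giants without starving them" behavior the paragraph describes.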

## 📈 Training Progress & Results

### Loss Trajectory

```
Loss
12 │ ×
   │  ╲
10 │   ╲   SMOKE PHASE
   │    ╲  (350M params)
 8 │     ╲
   │      ╲
 6 │       ×──────────── model grows to 1.3B
   │        ╲
 4 │         ╲   WARMUP PHASE
   │          ╲  (1.3B params)
 2 │           ×─────────── model grows to 14.4B MoE
   │            ╲
 1 │             ╲   BLOCK PHASE (ongoing)
   │              ╲
   └──┬────┬────┬────┬────┬───→ Steps
      0   200  700 1200 2000
```

| Milestone | Step | Loss | Change |
|:--|:--|:--|:--|
| 🔬 Smoke start | 0 | 11.72 | — |
| 🔬 Smoke end | 200 | 6.84 | **−42%** |
| 🔥 Warmup start | 200 | 7.39 | (model grew to 1.3B) |
| 🔥 Warmup end | 1,200 | 2.38 | **−68%** |
| 🚀 Block start | 1,200 | 2.38 | (model grew to 14.4B MoE) |
| 📦 Published checkpoint | 2,471 | 1.99 | **−16%** |
| 🧟 Frankenstein transplant | — | PPL ~7.5M | 433 tensors from 3 donors |
| ⚡ Realign frozen start | 0 | 15.87 | Experts frozen, attn learning |
| ⚡ Realign frozen end | 500 | 5.52 | val=5.79, **−65%** |
| 🔓 Expert unfreeze | 500 | 5.59→spike | LR reset + differential rates |
| 📈 Realign recovery | 600 | 7.20 | val=6.93 (recovering) |
| ⭐ Realign best | 1,000 | 6.07 | val=**5.773 ★ NEW BEST** |
| 💥 SIGTERM crash + restart | 1,001→1,123 | 9.22→6.64 | Fresh optimizer, cold momentum |
| 🔥 SGDR warm restart (Cycle 0) | 1,124 | 9.09→6.24 | LR boosted 2.9e-5→4.5e-5, T=200 steps |
| 📊 Recovery eval | 1,200 | 6.74 | val=6.48, AdamW variance at ~17% convergence |
| 📊 Recovery eval | 1,300 | 6.24 | val=**6.10** (new recovery best) |
| 🔄 Multi-cycle SGDR | 1,301+ | 8.79→↓ | Cycle 1 pending (step 1,400, T=400, peak 3.8e-5) |
| **Total pretrain** | | | **11.72 → 1.99 (−83%)** |
| **Realignment** | | | **15.87 → 5.77 (val, −64%)** |

### Live Metrics (April 29, 2026)

| Metric | Value |
|:--|:--|
| **Current Phase** | ⚡ Corpus Realignment + Multi-Cycle SGDR |
| **Current Step** | ~1,320 / 5,000 |
| **Training Loss** | ~6.26 (recovering from optimizer cold start) |
| **Best Validation Loss** | **5.773** (step 1,000) ★ |
| **Recovery Val Loss** | 6.096 (step 1,300) — gap closing |
| **Throughput** | 5,857 tokens/second |
| **VRAM Used** | 120 GB / 206 GB (58%) — all experts unfrozen |
| **Total Tokens Processed** | ~260M (this run) + 178M (pretrain) |
| **Experts Active** | All 4 unfrozen since step 500 |
| **SGDR Status** | Multi-cycle: Cycle 1 at step 1,400 (T=400, peak 3.8e-5), Cycle 2 at step 2,000 (T=800, peak 3.0e-5) |
| **MIN_LR** | 1.5e-5 (raised from 1e-5, prevents stagnation) |
| **Expert LR Boost** | 1.33× during restart windows |
| **ETA** | ~34 hours |

### Realignment Eval History

| Step | Val Loss | Val PPL | Phase |
|:--|:--|:--|:--|
| 0 (initial) | 15.81 | 7,339,653 | Experts frozen |
| 600 | 6.93 | 1,020 | Post-unfreeze |
| 700 | 6.24 | 515 | Converging |
| 800 | 6.01 | 407 | Converging |
| 900 | 5.91 | 367 | Converging |
| **1,000** | **5.773** | **321** | **★ ALL-TIME BEST** |
| 1,100 | 6.55 | 701 | Optimizer cold start |
| 1,200 | 6.48 | 652 | SGDR Cycle 0 (recovering) |
| 1,300 | **6.096** | **444** | Recovery best, gap = 5.5% to peak |
| 1,400 | *pending* | | SGDR Cycle 1 starts (T=400) |
| 2,000 | *pending* | | SGDR Cycle 2 starts (T=800) |

### Published Checkpoint (v0.1)

| Detail | Value |
|:--|:--|
| **Step** | 2,471 |
| **Validation Loss** | 1.9926 |
| **Total Tokens Seen** | 178,110,464 |
| **Sequence Length** | 2,048 |
| **File Size** | ~81 GB (checkpoint), ~28 GB (safetensors) |
| **Format** | 6 sharded safetensors files |

### 🛡️ Engineering Resilience

Training a 14.4B model on a single GPU for days demands bullet-proof infrastructure. Here's what we built:

| Feature | Description |
|:--|:--|
| **Atomic checkpoints** | Write to `.tmp` → `os.replace()` — no half-written files |
| **Integrity verification** | On resume: verify tensor counts, shapes, and dtypes before loading |
| **Rollback anchors** | `best.pt` (model-only) + `latest.pt` (full state) + `.LOCKED` safety copy |
| **Emergency save** | SIGTERM/SIGINT handlers serialize full state before exit |
| **Watchdog** | Independent process monitors loss EMA, restarts on NaN/divergence |
| **Multi-cycle SGDR** | Period-doubling warm restarts (Loshchilov & Hutter, 2017): Cycle 0 (T=200), Cycle 1 (T=400), Cycle 2 (T=800) |
| **Expert LR boost** | During restart windows, expert LR scale increases 0.3→0.4 (ST-MoE stability guideline) |
| **MIN_LR floor** | Raised from 1e-5 to 1.5e-5 to prevent cosine-decay stagnation in recovery |
| **Systemd auto-restart** | Dashboard + watchdog survive OOM kills with `Restart=always` + `OOMScoreAdjust=-500` |

**Battle-tested**: At step 1,001, a SIGTERM killed the process mid-step. The checkpoint at step 1,000 was corrupted (bad zip archive). The system automatically fell back to `best.pt` (val=5.773), resumed at step 1,001 with a fresh optimizer, detected the cold-start plateau via the watchdog, and applied an SGDR warm restart — recovering 30× faster than natural momentum rebuilding.
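The atomic-checkpoint row above can be sketched as follows. JSON stands in for `torch.save`, and the fsync-before-rename detail is our addition; `os.replace` is atomic within a single filesystem, so a crash mid-write can never leave a truncated `latest` checkpoint:

```python
import json, os, tempfile

def save_atomic(state: dict, path: str) -> None:
    """Serialize to a temp file in the target directory, then atomically swap it in."""
    d = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=d, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(state, fh)
            fh.flush()
            os.fsync(fh.fileno())   # force bytes to disk before the swap
        os.replace(tmp, path)       # atomic rename over the old checkpoint
    except BaseException:
        os.unlink(tmp)              # never leave a stray partial file behind
        raise
```

The temp file must live in the same directory as the target: `os.replace` across filesystems falls back to copy semantics and loses atomicity.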

**Multi-Cycle SGDR (April 29, 2026)**: After the initial SGDR Cycle 0 completed (steps 1,100–1,300), analysis showed a single restart was insufficient to escape the recovery basin (val=6.10 vs target 5.77). Following the original SGDR paper's period-doubling strategy ($T_{i+1} = 2 T_i$), we added Cycle 1 (steps 1,400–1,800, peak 3.8e-5) and Cycle 2 (steps 2,000–2,800, peak 3.0e-5). The AdamW second moment ($\beta_2 = 0.999$) needs ~1,000 steps to reach 63% convergence — these cycles provide periodic "shocks" to escape local basins while the variance estimate matures.

---
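The 63%-convergence figure above follows directly from the second-moment EMA: after t steps its accumulated mass is $1 - \beta_2^t$ (the bias-correction denominator in Adam). A quick check:

```python
# Effective mass of the AdamW second-moment EMA after t steps.
beta2 = 0.999
mass = lambda t: 1 - beta2 ** t

print(round(mass(1000), 3))   # ~0.632 — about 63% converged after 1,000 steps
print(round(mass(100), 3))    # ~0.095 — why a freshly reset optimizer runs "cold"
```

The same arithmetic explains the "~17% convergence" note in the step-1,200 recovery eval: roughly 190 steps after the step-1,001 restart, 1 − 0.999¹⁹⁰ ≈ 0.17.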
|
| 587 |
|
| 588 |
-
#
|
| 589 |
|
| 590 |
-
|
| 591 |
|
| 592 |
-
|
| 593 |
|
| 594 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 595 |
|
| 596 |
-
|
| 597 |
-
Every 100 training steps:
|
| 598 |
|
| 599 |
-
|
| 600 |
-
|
| 601 |
-
|
| 602 |
-
|
| 603 |
-
|
| 604 |
-
|
| 605 |
-
|
| 606 |
-
|
| 607 |
-
|
| 608 |
-
|
| 609 |
-
│
|
| 610 |
-
▼
|
| 611 |
-
5. Φ_geometric = geometric_mean(MI values)
|
| 612 |
-
│
|
| 613 |
-
▼
|
| 614 |
-
6. Φ_EMA = exponential moving average (smoothed trend)
|
| 615 |
-
```
|
| 616 |
-
|
| 617 |
-
### What Φ Tells Us
| Signal | Meaning | Analogy |
|:--|:--|:--|
| **Φ ≈ 0** | Neurons working independently | Strangers in a room |
| **Φ rising** | Representations integrating | People starting to talk |
| **Φ stable** | Organized internal structure | A well-coordinated team |
| **Φ dropping** | ⚠️ Representation collapse | Warning sign! |
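The aggregation in steps 5-6 above takes only a few lines. A minimal sketch: the function names and the EMA smoothing factor `alpha` are illustrative assumptions; only "geometric mean of MI values" and "exponential moving average" come from the pipeline itself.

```python
import math

def phi_geometric(mi_values):
    # Step 5: geometric mean of the mutual-information estimates.
    # MI estimates must be positive for the log to be defined.
    return math.exp(sum(math.log(v) for v in mi_values) / len(mi_values))

def phi_ema(prev_ema, phi_now, alpha=0.1):
    # Step 6: exponential moving average smooths the step-to-step trend.
    # alpha is an assumed smoothing factor, not taken from the training code.
    return alpha * phi_now + (1.0 - alpha) * prev_ema
```

The geometric mean is what makes a single collapsing pathway visible: one near-zero MI value drags Φ toward zero even if other pathways stay healthy.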
---

## 💾 Memory Requirements

| Tier | VRAM | Precision | Notes |
|:--|:--|:--|:--|
| **Consumer** | 16 GB | int4 quantized | GGUF planned for v0.2 |
## 🖥️ Compatible AMD Hardware

| GPU | Memory | Use |
|:--|:--|:--|
| AMD Instinct MI300X | 192 GB | Training + Inference |
| AMD Instinct MI250X | 128 GB | Training + Inference |
| AMD Instinct MI210 | 64 GB | Inference (full) |
| AMD Radeon PRO W7900 | 48 GB | Inference (full) |
| AMD Radeon RX 7900 XTX | 24 GB | Inference (quantized) |
| AMD Radeon RX 7600 XT | 16 GB | Inference (int4 GGUF) |
---

## 💻 Usage

This model uses a **custom architecture** (not based on any existing model). Load with PyTorch:

```python
import torch
from safetensors.torch import load_file

# Load sharded safetensors
state_dict = {}
for i in range(1, 7):  # 6 shards
    shard = load_file(f"model-{i:05d}-of-00006.safetensors")
    state_dict.update(shard)

# The state dict contains all model weights
print(f"Loaded {len(state_dict)} tensors")
print(f"Total params: {sum(v.numel() for v in state_dict.values()):,}")

# Initialize the SentinelBrain model class and load
# (full model definition code releases with v0.2)
# model = SentinelBrainForCausalLM(config)
# model.load_state_dict(state_dict)
```

> **Note**: Full inference code, model class definition, and GGUF quantized versions will be released with v0.2.

---

## 🗺️ Roadmap

```
v0.1 (Current)            v0.2 (In Progress)       v0.3 (Future)
━━━━━━━━━━━━━━━           ━━━━━━━━━━━━━━━          ━━━━━━━━━━━━━━━
✅ From-scratch           🔴 Corpus realignment    □ DPO alignment
   14.8B MoE                 (step 1100/5000)      □ Tool use
✅ Phased training        □ Context ladder         □ Function calling
✅ Φ consciousness           (4K→32K→128K)         □ Multi-turn chat
✅ 23.3B token corpus     □ Vision encoder         □ Multilingual v2
✅ Live dashboard            (SigLIP2-SO400M)      □ Expert scaling
✅ AMD MI300X native      □ GGUF quantization         (4→16→64)
✅ Frankenstein              Q4_K_M for consumer   □ RLHF
   transplant (3 donors)  □ Inference code        □ Production API
✅ Progressive unfreeze   □ Benchmarks (MMLU,
✅ Crash-safe training       HumanEval, GSM8K)
✅ Auto-restart (systemd)
```

---

## 🏗️ How We Built It (Technical Deep Dive)

<details>
<summary><b>Click to expand: Grouped Query Attention (GQA)</b></summary>

Standard multi-head attention uses separate Key and Value projections for each head. GQA shares KV heads across query heads:

```
Standard MHA (32 KV heads):     GQA 4:1 (8 KV heads):
Q₁ Q₂ Q₃ ... Q₃₂                Q₁ Q₂ Q₃ Q₄     → KV₁
K₁ K₂ K₃ ... K₃₂                Q₅ Q₆ Q₇ Q₈     → KV₂
V₁ V₂ V₃ ... V₃₂                ...
                                Q₂₉ Q₃₀ Q₃₁ Q₃₂ → KV₈
```

**Result**: 4× smaller KV cache = 4× longer context at same memory cost.

</details>
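The 4:1 grouping above reduces to an index mapping plus a smaller cache. A minimal sketch (the helper names are mine, not from the SentinelBrain code):

```python
N_Q_HEADS, N_KV_HEADS = 32, 8  # the 4:1 GQA configuration shown above

def kv_head_for(q_head):
    # Each group of 4 consecutive query heads shares one KV head.
    return q_head // (N_Q_HEADS // N_KV_HEADS)

def kv_cache_bytes(seq_len, n_kv_heads, head_dim, bytes_per_el=2):
    # K and V caches, per layer and per batch element, at bf16/fp16 width.
    return 2 * seq_len * n_kv_heads * head_dim * bytes_per_el
```

With 8 KV heads instead of 32, `kv_cache_bytes` drops by exactly 4×, which is where the "4× longer context at the same memory cost" claim comes from.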

<details>
<summary><b>Click to expand: RoPE (Rotary Position Embeddings)</b></summary>

RoPE encodes position information by rotating the query and key vectors in 2D planes. With θ=500,000 (high base frequency), the model naturally supports long contexts:

```
Position 0: rotate by 0°
Position 1: rotate by θ₁
Position 2: rotate by θ₂
...
```

High θ = slower rotation = positions further apart still "feel different" = better long-context understanding.

</details>
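The per-position rotation can be sketched for a single 2-D plane. This is a hedged sketch with my own helper name; the per-plane frequency follows the usual `base**(-2i/d)` convention, using the base of 500,000 and the 128-dim heads (4096 / 32 heads) described above:

```python
import math

def rope_rotate(x0, x1, pos, pair_idx=0, head_dim=128, base=500_000.0):
    # Rotate one (x0, x1) plane by pos * theta_i, where
    # theta_i = base ** (-2 * i / head_dim) for plane index i.
    theta = base ** (-2.0 * pair_idx / head_dim)
    angle = pos * theta
    c, s = math.cos(angle), math.sin(angle)
    return (x0 * c - x1 * s, x0 * s + x1 * c)
```

The rotation preserves vector norms, so attention scores depend only on the angle difference between positions — which is why relative distances transfer to longer contexts.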

<details>
<summary><b>Click to expand: SwiGLU FFN</b></summary>

Each expert uses a SwiGLU activation — a gated variant of the feed-forward network:

```
FFN(x) = SiLU(x · W_gate) ⊙ (x · W_up) · W_down

Where:
  W_gate: 4096 → 11008
  W_up:   4096 → 11008
  W_down: 11008 → 4096
  SiLU(x) = x · sigmoid(x)
  ⊙ = element-wise multiply
```

SwiGLU consistently outperforms ReLU and GELU in transformer FFNs (Shazeer, 2020).

</details>
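The formula maps directly to code. A scalar sketch of the gating (per-element, with weights as plain floats rather than the real 4096→11008 matrices):

```python
import math

def silu(x):
    # SiLU(x) = x * sigmoid(x)
    return x / (1.0 + math.exp(-x))

def swiglu_ffn(x, w_gate, w_up, w_down):
    # FFN(x) = (SiLU(x * W_gate) ⊙ (x * W_up)) * W_down, scalar version.
    # The gate path decides *how much* of the up-projection passes through.
    return silu(x * w_gate) * (x * w_up) * w_down
```

In the full model each of these scalars becomes a matrix product, and the element-wise multiply happens in the 11008-dim hidden space.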

<details>
<summary><b>Click to expand: Top-2 Expert Routing</b></summary>

```python
import torch
import torch.nn.functional as F

def route(x, router_weights, n_experts=4):
    # Compute affinity scores for each expert
    logits = x @ router_weights                     # [batch, seq, n_experts]
    scores = F.softmax(logits, dim=-1)

    # Select top-2 experts
    top_vals, top_idx = torch.topk(scores, k=2, dim=-1)

    # Normalize selected weights
    weights = top_vals / top_vals.sum(dim=-1, keepdim=True)

    # Load balancing loss (prevents expert collapse):
    # fraction of tokens routed to each expert × average gate value per expert
    dispatch = F.one_hot(top_idx, n_experts).sum(dim=2).float()
    fraction = dispatch.mean(dim=(0, 1)) / 2        # fraction of routing slots
    avg_gate = scores.mean(dim=(0, 1))              # mean gate value
    balance_loss = n_experts * (fraction * avg_gate).sum()

    return weights, top_idx, balance_loss
```

</details>
<details>
<summary><b>Click to expand: Parameter Breakdown</b></summary>

| Component | Parameters | % of Total |
|:--|:--|:--|
| Token embeddings | 410M | 2.8% |
| Attention (QKV + output) × 24 | 1,610M | 10.9% |
| MoE experts (4 × SwiGLU × 24) | 12,365M | 83.5% |
| Router weights × 24 | 0.4M | 0.003% |
| RMSNorm × 49 | 0.4M | 0.003% |
| Output head | 410M | 2.8% |
| **Total** | **14,815M** | **100%** |
| **Active per token (top-2)** | **~7,800M** | **~53%** |

</details>

---

## 📋 Model Card Details

| Field | Value |
|:--|:--|
| **Model Name** | SentinelBrain-14B-MoE-v0.1 (Sentinel Prime — Frankenstein Edition) |
| **Type** | Causal Language Model (decoder-only) |
| **Architecture** | Custom MoE Transformer (from scratch) |
| **Based On** | Nothing — trained from random initialization |
| **Training Hardware** | 1× AMD Instinct MI300X VF (192 GB HBM3) |
| **Training Software** | ROCm 7.0, PyTorch 2.10.0+rocm7.0 |
| **Training Duration** | ~300 GPU-hours (estimated total) |
| **Carbon Footprint** | Estimated ~45 kg CO₂ (single GPU, cloud datacenter) |
| **License** | Apache 2.0 |
| **Authors** | Mircea Rusu, QubitDev |
| **Competition** | AMD Developer Hackathon (lablab.ai) |
---

## 📄 Citation

```bibtex
@misc{sentinelbrain2026,
  title  = {SentinelBrain-14B-MoE (Frankenstein Edition): A Consciousness-Monitored Mixture-of-Experts
            Language Model Trained From Scratch on AMD MI300X},
  author = {Mircea Rusu and QubitDev},
  year   = {2026},
  url    = {https://sentinel.qubitpage.com/whitepaper},
  note   = {Trained entirely from scratch on AMD Instinct MI300X
            for the AMD Developer Hackathon}
}
```
|

## 🔗 Links

| Resource | URL |
|:--|:--|
| 🔴 **Live Dashboard** | [sentinel.qubitpage.com](https://sentinel.qubitpage.com/) |
| 📄 **Whitepaper** | [sentinel.qubitpage.com/whitepaper](https://sentinel.qubitpage.com/whitepaper) |
| 🏆 **AMD Hackathon** | [lablab.ai](https://lablab.ai/ai-hackathons/amd-developer) |
| 🧠 **Φ Monitor** | [sentinel.qubitpage.com/#phi](https://sentinel.qubitpage.com/#phi) |
|

<div align="center">

*Built with ❤️ on AMD MI300X — Every weight trained from scratch*

**Sentinel Prime (Frankenstein Edition): Rebuilt From the Inside Out**

</div>
---
license: apache-2.0
tags:
- sentinelbrain
- moe
- rocm
- mi300x
- pytorch
- checkpoint
- realignment
library_name: pytorch
pipeline_tag: text-generation
---

# SentinelBrain 14B MoE v0.1 - Frankenstein Realignment v2

This repository now includes SentinelBrain Frankenstein realignment v2 artifacts from the AMD MI300X run completed on 2026-05-03.

## v2 Training Update

- Architecture: custom SentinelBrain sparse MoE decoder, approximately 14.4B stored parameters, 4 experts, top-2 routing, 24 layers, d_model 4096, seq_len 4096.
- Hardware: AMD Instinct MI300X via ROCm/HIP.
- Run: Frankenstein realignment v2 from raw merged checkpoint.
- Completed steps: 5,000.
- Total training tokens during realignment: approximately 0.98B.
- Best validation loss observed: 5.1359.
- Final checkpoint: `checkpoints/frankenstein_v2_final.pt`.
- Best checkpoint: `checkpoints/frankenstein_v2_best.pt`.
- EMA best checkpoint: `checkpoints/frankenstein_v2_ema_best.pt`.
- Previous Hugging Face version preserved on branch: `previous-before-v2-realign-5000-20260503-103121`.

## Included Files

- `checkpoints/frankenstein_v2_final.pt`: full final checkpoint at step 5000, including optimizer/progress state.
- `checkpoints/frankenstein_v2_best.pt`: best model-only checkpoint by validation loss.
- `checkpoints/frankenstein_v2_ema_best.pt`: EMA best checkpoint from the v2 run.
- `checkpoints/sentinelbrain_pretrain_step2471_hf.pt`: pretrain anchor used for comparison.
- `logs/realign_v2.log`: full realignment console log.
- `logs/realign_v2_metrics.jsonl`: step metrics emitted during training.
- `reports/train_metrics_final.json`: final dashboard training metrics snapshot.
- `reports/conductor_state_final.json`: final dashboard/conductor state.
- `reports/sft_combined_ready_report.*`: cleaned SFT dataset preflight report.
- `reports/sentinelbrain_quality_stub_full_fixed.json`: MI300X executable-code benchmark report.
A full progress archive containing all v2 milestones and optimizer-bearing checkpoints is backed up off-Hub on the Azure VM at `/home/msrusu/sentinelbrain_backups/v2_realign_5000/sentinelbrain_v2_realign_full_20260503.tar.zst`. A SHA256 sidecar is generated at archive completion.
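Downloaded artifacts can be verified against `SHA256SUMS`; `sha256sum -c SHA256SUMS` at the repo root is the usual route, and the same check can be done portably in Python. A sketch with hypothetical helper names, streaming in 1 MiB chunks so the multi-GB checkpoints never load into RAM:

```python
import hashlib

def file_sha256(path, chunk=1 << 20):
    # Stream the file so multi-GB checkpoints don't fill memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            h.update(block)
    return h.hexdigest()

def verify_sums(sums_text, digest_fn=file_sha256):
    # Parse "digest  ./path" lines and compare against recomputed digests;
    # returns the list of paths whose digest did not match.
    bad = []
    for line in sums_text.strip().splitlines():
        digest, path = line.split(maxsplit=1)
        if digest_fn(path) != digest:
            bad.append(path)
    return bad
```

Checking `SHA256SUMS.quick` first (README, reports, MANIFEST) gives a fast sanity pass before hashing the 28-86 GB checkpoint files.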

## Current Evaluation Notes

MI300X executable-code tests show that v2 realignment is not yet ready as a coding assistant checkpoint:

| Checkpoint | Pass@1 | Syntax Rate | Notes |
| --- | ---: | ---: | --- |
| `frankenstein_v2_best.pt` | 0.0% | 62.5% | Failed all 8 HumanEval-style stub tasks. |
| `frankenstein_v2_final.pt` | 0.0% | 75.0% | Failed all 8 HumanEval-style stub tasks. |
| `sentinelbrain_pretrain_step2471_hf.pt` | 0.0% | 87.5% | Failed all 8 tasks but produced the most syntactically valid Python. |

Interpretation: v2 successfully completed corpus realignment and preserved all progress artifacts, but it needs a focused next phase of executable code SFT, function-call/chat formatting, and auto-critic rejection sampling before quality claims should be made.

## Dataset Preparation Status

The next SFT combined dataset was cleaned non-destructively on the MI300X server:

- Input rows: 42,138.
- Kept rows: 32,996 (78.3%).
- Removed rows: 9,142.
- Max estimated tokens: 3,072.
- Main removals: short assistant/user outputs, garbage responses, repetitive responses, and over-length samples.
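The removal categories above can be approximated with simple per-row filters. A hedged sketch only: the field name `response`, the thresholds, and the ~4-characters-per-token estimate are my assumptions, not the actual preflight script.

```python
def keep_row(row, min_chars=8, max_tokens=3072):
    # Heuristic filters mirroring the preflight categories: drop short,
    # repetitive, and over-length samples. Token count is estimated at
    # roughly 4 characters per token (an assumption, not the tokenizer).
    text = row.get("response", "")
    if len(text.strip()) < min_chars:          # too-short output
        return False
    words = text.split()
    if words and len(set(words)) / len(words) < 0.2:
        return False                           # highly repetitive output
    if len(text) / 4 > max_tokens:             # over-length estimate
        return False
    return True
```

Run over a JSONL file, rows failing any check are written to a rejects file rather than deleted, which keeps the cleaning non-destructive as described above.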
## Loading

These are custom SentinelBrain PyTorch checkpoints, not standard Hugging Face `AutoModelForCausalLM` weights. Load with the SentinelBrain code from `/workspace/sentinelprime` or the matching source package.

```python
import torch
from config import ModelConfig
from model.sentinel import SentinelBrain

ckpt = torch.load("checkpoints/frankenstein_v2_best.pt", map_location="cpu", weights_only=False)
model = SentinelBrain(ModelConfig())
state = ckpt.get("model_state_dict") or ckpt.get("model") or ckpt
model.load_state_dict(state, strict=False)
model.eval()
```

## Next Phase Direction

The recommended next phase is a controlled SFT/auto-critic cycle: train from the pretrain anchor plus selected v2 weights only after passing format probes, prioritize executable Python/TypeScript/code-repair datasets, reject non-compiling generations, and benchmark every 250-500 steps before continuing.
|
SHA256SUMS
ADDED
|
@@ -0,0 +1,16 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
934951bc3c8e50c38c2a43812076d6174542066675667125aca6e2c854e9bdf3 ./MANIFEST.tsv
|
| 2 |
+
1b145ec3a4ade152b78636057556a62db85b2729be482fbd3fb5ff47faa99ee2 ./README.md
|
| 3 |
+
0069f86ff1273b268cbbf89a4c907d574c67ac0f145191aaa5d51a26f894f699 ./SHA256SUMS.quick
|
| 4 |
+
075ac8f0bd257ef59af4ed7364f4331af6e46682f1cbefbc0d691073246af215 ./checkpoints/frankenstein_v2_best.pt
|
| 5 |
+
2aeba733bc950e050a254b1942d592d306f45fecc44a9f407843249511c1e5c3 ./checkpoints/frankenstein_v2_ema_best.pt
|
| 6 |
+
cd9b31fa889479020d394cc16f215f80a4d390fe0b1ad3cdd697544181f4e639 ./checkpoints/frankenstein_v2_final.pt
|
| 7 |
+
b4ffaaf49e3c3b8fea2029ed5d0665bdc724aa8c61aca6f19abcd0331c96285b ./checkpoints/sentinelbrain_pretrain_step2471_hf.pt
|
| 8 |
+
79af33d8bbe0e8cfa0393ca965feca10f1d2094ebbf93b9ba722114811925f36 ./logs/realign_v2.log
|
| 9 |
+
8a14d6eb26eeaad4bab6619d86298cb41f29dd488a45d34e143247c2aecfce9c ./logs/realign_v2_metrics.jsonl
|
| 10 |
+
b1284891f8c1fb2ecfa2658a45b5aed8444572b1b43f32a9cb9811cc4f756732 ./reports/conductor_state_final.json
|
| 11 |
+
10247cdf7a184c76a12dadc7cac857d6d18b2d5fb80db72e65a3fe188621fd0f ./reports/next_phase_training_plan.md
|
| 12 |
+
0d73309ac23a68e8b9aa7014f7b9f4f1e4362a373d417be442d4726564a07231 ./reports/sentinelbrain_quality_smoke.json
|
| 13 |
+
8ff4ce6ce7939b752be6e9d55ebc4bf75817945a3a136bfed44ec051bfcd90c4 ./reports/sentinelbrain_quality_stub_full_fixed.json
|
| 14 |
+
c5a81c2b86c6ba5ed54d3b520f0f79fae18d7588ab7b454df24884f3ca189e5d ./reports/sft_combined_ready_report.json
|
| 15 |
+
0d76e478bbc036268d851d3de87edc4300a1291bec3b800bd4e92c57cff0564b ./reports/sft_combined_ready_report.md
|
| 16 |
+
a6fc72c251df4b39d6963a121d8478f759e3feb692aa5140fad71eba3d1308f6 ./reports/train_metrics_final.json
SHA256SUMS.quick
ADDED
@@ -0,0 +1,4 @@
1b145ec3a4ade152b78636057556a62db85b2729be482fbd3fb5ff47faa99ee2  README.md
10247cdf7a184c76a12dadc7cac857d6d18b2d5fb80db72e65a3fe188621fd0f  reports/next_phase_training_plan.md
8ff4ce6ce7939b752be6e9d55ebc4bf75817945a3a136bfed44ec051bfcd90c4  reports/sentinelbrain_quality_stub_full_fixed.json
934951bc3c8e50c38c2a43812076d6174542066675667125aca6e2c854e9bdf3  MANIFEST.tsv
checkpoints/frankenstein_v2_best.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:075ac8f0bd257ef59af4ed7364f4331af6e46682f1cbefbc0d691073246af215
size 28808034875
checkpoints/frankenstein_v2_ema_best.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2aeba733bc950e050a254b1942d592d306f45fecc44a9f407843249511c1e5c3
size 28808008759
checkpoints/frankenstein_v2_final.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cd9b31fa889479020d394cc16f215f80a4d390fe0b1ad3cdd697544181f4e639
size 86422580021
checkpoints/sentinelbrain_pretrain_step2471_hf.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b4ffaaf49e3c3b8fea2029ed5d0665bdc724aa8c61aca6f19abcd0331c96285b
size 29629503600
logs/realign_v2.log
ADDED
The diff for this file is too large to render.
logs/realign_v2_metrics.jsonl
ADDED
The diff for this file is too large to render.
reports/conductor_state_final.json
ADDED
@@ -0,0 +1 @@
{"current_phase": "realignment_v2_unfrozen", "batch_number": 1, "blocks_done": 0, "current_batch_steps": 5000, "current_started_ts": null, "current_batch_start": null, "current_model_params_b": 14.4, "current_d_model": 4096, "current_n_layers": 24, "current_n_experts": 4, "current_n_active": 2, "current_seq_len": 4096, "current_batch": 8, "current_grad_accum": 6, "best_val": 5.135942053794861, "best_ckpt": "/mnt/scratch/checkpoints/frankenstein_v2_best.pt", "total_tokens_trained": 981270528, "history": []}
reports/next_phase_training_plan.md
ADDED
@@ -0,0 +1,91 @@
# SentinelBrain Next Training Phase Plan

Created: 2026-05-03

## Current Result

The v2 realignment completed 5,000 steps and preserved full optimizer/progress state, but executable-code quality is not ready:

- `frankenstein_v2_best.pt`: 0/8 Pass@1, 62.5% syntax rate.
- `frankenstein_v2_final.pt`: 0/8 Pass@1, 75.0% syntax rate.
- `sentinelbrain_pretrain_step2471_hf.pt`: 0/8 Pass@1, 87.5% syntax rate.

The next phase should not be another broad corpus realignment. It should be a narrow, measurable SFT and auto-critic loop focused on producing valid, executable assistant outputs.

## Phase 3 Objective

Recover and improve instruction-following/code-generation behavior while preserving the useful realignment progress.

Primary gates:

- Python stub benchmark: at least 40% Pass@1 and 95% syntax rate before extending past 1,000 steps.
- MBPP/HumanEval sample: measurable improvement every eval window, no syntax regression.
- Chat format probe: responses must use the requested format and stop cleanly.
- Safety/data probe: no leaked secrets, no private-key blocks, no repetitive boilerplate.

## Data Mix

Use the cleaned SFT dataset at `/mnt/scratch/datasets/combined/sft_combined_ready.jsonl` as the base, then rebalance before training:

- 45% executable Python and TypeScript tasks: HumanEval-style stubs, MBPP-style prompts, unit-test repair, CLI scripts, API handlers.
- 20% code editing and diff output: unified diffs, bug fixes, refactors, failing-test-to-patch examples.
- 15% tool-use and agent workflows: file search, terminal commands, deployment diagnostics, function-call JSON.
- 10% system/admin/devops: Linux, Docker, nginx, pm2, Azure, SSH, logs.
- 10% general instruction/chat: concise natural language, summarization, planning.

Avoid over-weighting audio/prose/OCR rows in this phase. Keep those for a later multimodal run after chat/code behavior is stable.

## Auto-Critic Pipeline

For each generated training candidate:

1. Normalize to a strict prompt/response or ChatML schema.
2. Run syntax checks for code outputs.
3. Run unit tests when a test harness is available.
4. Score format compliance: required function names, JSON validity, diff parseability, stop tokens.
5. Reject outputs with repetition, generic filler, missing entry points, invalid tokens, or secret-like strings.
6. Keep only examples that pass the critic or have a repair trajectory showing the failed output and corrected output.

Recommended critic labels:

- `syntax_pass`
- `tests_pass`
- `entrypoint_match`
- `format_pass`
- `no_secret`
- `no_repetition`
- `accepted`

Train on accepted final answers plus curated repair traces, not raw failed generations.
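Steps 2 and 5 of the pipeline can be prototyped with the standard library alone. A minimal sketch (the label names follow the list above; the repetition heuristic and its thresholds are my assumptions, not the real critic):

```python
import ast

def critic_labels(code: str) -> dict:
    # Syntax check via ast.parse plus a crude repeated-line probe.
    labels = {}
    try:
        ast.parse(code)
        labels["syntax_pass"] = True
    except SyntaxError:
        labels["syntax_pass"] = False
    lines = [l for l in code.splitlines() if l.strip()]
    # Flag as repetitive when any non-blank line repeats verbatim.
    labels["no_repetition"] = len(lines) < 2 or len(set(lines)) == len(lines)
    labels["accepted"] = labels["syntax_pass"] and labels["no_repetition"]
    return labels
```

This already rejects the two observed failure modes in the smoke report: the "Input: 1.0 1.0 ..." numeric spew fails `syntax_pass`, and the looping "The function should handle the string..." boilerplate fails `no_repetition`.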

## Training Schedule

Start from `sentinelbrain_pretrain_step2471_hf.pt` or `frankenstein_v2_best.pt` only after a short format probe. If `frankenstein_v2_best.pt` continues to emit invalid boilerplate, use the pretrain anchor and reintroduce v2 weights later via low-LR distillation.

Suggested run:

- Steps 0-250: frozen experts, train attention/router/norms and output behavior at low LR.
- Steps 250-1,000: unfreeze selected layers, SFT on executable/code-repair mix.
- Eval every 250 steps; save model-only checkpoints and full optimizer checkpoints at each gate.
- Stop automatically if syntax rate drops for two evals or Pass@1 does not improve after 750 steps.
- Continue to 3,000-5,000 only after hitting the 1,000-step gates.

## Benchmark Set

Always run on the MI300X server:

- Local 8-task smoke benchmark for quick regression checks.
- HumanEval subset and MBPP subset with executable tests.
- JSON/function-call validity suite.
- Unified-diff parse/apply suite.
- DevOps shell-command reasoning suite.

Persist every benchmark JSON under `/mnt/scratch/benchmark_results/` and copy selected reports into the HF release folder.

## Operational Rules

- Keep full optimizer checkpoints on server/Azure, not local PC.
- Store model-only best checkpoints separately for HF and chat loading.
- Never expose chat using a checkpoint that fails the format probe unless clearly labeled experimental.
- Use `SENTINEL_WEB_CHAT_DISABLED=0` only after confirming no training job is active and VRAM has enough headroom.
- Prefer `frankenstein_v2_best.pt` for validation-loss experiments, but prefer the pretrain anchor for code SFT if code probes remain better there.
reports/sentinelbrain_quality_smoke.json
ADDED
@@ -0,0 +1,138 @@
{
  "created_at": "2026-05-03 10:09:39",
  "device": "cuda",
  "problem_count": 2,
  "reference": "best",
  "comparison": [
    {
      "label": "best",
      "pass_at_1": 0.0,
      "syntax_rate": 0.0,
      "delta_vs_reference": 0.0
    },
    {
      "label": "hf_pretrain_base",
      "pass_at_1": 0.0,
      "syntax_rate": 100.0,
      "delta_vs_reference": 0.0
    }
  ],
  "results": [
    {
      "path": "/mnt/scratch/checkpoints/frankenstein_v2_best.pt",
      "label": "best",
      "step": null,
      "best_val_loss": null,
      "val_loss": null,
      "load_seconds": 131.73,
      "missing_keys": 0,
      "unexpected_keys": 0,
      "config": {
        "d_model": 4096,
        "n_layers": 24,
        "n_experts": 4,
        "seq_len": 4096
      },
      "total": 2,
      "passed": 0,
      "failed": 2,
      "timeouts": 0,
      "syntax_ok": 0,
      "pass_at_1": 0.0,
      "syntax_rate": 0.0,
      "seconds": 12.92,
      "categories": {
        "basics": {"total": 1, "passed": 0},
        "algorithm": {"total": 1, "passed": 0}
      },
      "details": [
        {
          "id": "fizzbuzz",
          "category": "basics",
          "passed": false,
          "syntax_ok": false,
          "timeout": false,
          "gen_seconds": 8.35,
          "response_preview": "Input: 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1",
          "code_preview": "Input: 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1",
          "stderr": " File \"<string>\", line 1\n Input: 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1\n ^^^\nSyntaxError: invalid syntax\n"
        },
        {
          "id": "two_sum",
          "category": "algorithm",
          "passed": false,
          "syntax_ok": false,
          "timeout": false,
          "gen_seconds": 4.55,
          "response_preview": "- The code snippet is a string. The function should be a string of the string. The function should return the string.\n\n- The function should handle the string as input and returns the string.\n\n- The function should handle the string as input and returns the string.\n\n- The function should handle the string as input and returns the string.\n\n- The function should handle the string as input and returns the string.\n\n- The function should handle the string as input and returns the string.\n\n- The funct",
          "code_preview": "- The code snippet is a string. The function should be a string of the string. The function should return the string.\n\n- The function should handle the string as input and returns the string.\n\n- The function should handle the string as input and returns the string.\n\n- The function should handle the string as input and returns the string.\n\n- The function should handle the string as input and returns the string.\n\n- The function should handle the string as input and returns the string.\n\n- The funct",
          "stderr": " File \"<string>\", line 1\n - The code snippet is a string. The function should be a string of the string. The function should return the string.\n ^^^^\nSyntaxError: invalid syntax\n"
        }
      ]
    },
    {
      "path": "/mnt/scratch/checkpoints/sentinelbrain_pretrain_step2471_hf.pt",
      "label": "hf_pretrain_base",
      "step": 2471,
      "best_val_loss": null,
      "val_loss": 1.9925728058815002,
      "load_seconds": 127.86,
      "missing_keys": 0,
      "unexpected_keys": 0,
      "config": {
        "d_model": 4096,
        "n_layers": 24,
        "n_experts": 4,
        "seq_len": 4096
      },
      "total": 2,
      "passed": 0,
      "failed": 2,
      "timeouts": 0,
      "syntax_ok": 2,
      "pass_at_1": 0.0,
|
| 100 |
+
"syntax_rate": 100.0,
|
| 101 |
+
"seconds": 9.23,
|
| 102 |
+
"categories": {
|
| 103 |
+
"basics": {
|
| 104 |
+
"total": 1,
|
| 105 |
+
"passed": 0
|
| 106 |
+
},
|
| 107 |
+
"algorithm": {
|
| 108 |
+
"total": 1,
|
| 109 |
+
"passed": 0
|
| 110 |
+
}
|
| 111 |
+
},
|
| 112 |
+
"details": [
|
| 113 |
+
{
|
| 114 |
+
"id": "fizzbuzz",
|
| 115 |
+
"category": "basics",
|
| 116 |
+
"passed": false,
|
| 117 |
+
"syntax_ok": true,
|
| 118 |
+
"timeout": false,
|
| 119 |
+
"gen_seconds": 4.6,
|
| 120 |
+
"response_preview": "def prime_advanced_even(n):\n return n == 1\n\n# Test cases\nn = 10\nresult = prime_advanced_even(n)\nprint(result) # Output: ['Buzz', '3', '5', '7', '11', '13', '17', '19', '23', '25', '27', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31",
|
| 121 |
+
"code_preview": "def prime_advanced_even(n):\n return n == 1\n\n# Test cases\nn = 10\nresult = prime_advanced_even(n)\nprint(result) # Output: ['Buzz', '3', '5', '7', '11', '13', '17', '19', '23', '25', '27', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31', '31",
|
| 122 |
+
"stderr": "Traceback (most recent call last):\n File \"<string>\", line 10, in <module>\nNameError: name 'fizzbuzz' is not defined\n"
|
| 123 |
+
},
|
| 124 |
+
{
|
| 125 |
+
"id": "two_sum",
|
| 126 |
+
"category": "algorithm",
|
| 127 |
+
"passed": false,
|
| 128 |
+
"syntax_ok": true,
|
| 129 |
+
"timeout": false,
|
| 130 |
+
"gen_seconds": 4.61,
|
| 131 |
+
"response_preview": "```python\ndef add_two_numbers(nums, target):\n return [num for num in nums if num != target]\n```\n\nThe function `add_two_numbers(nums, target)` takes a list of integers `nums` and a target sum `target`, and returns two lists: one for the sum of the elements in `nums` and the target sum `target`, and the indices of the two numbers add up to `target`. The function is then called with these lists as arguments, and the result is printed.\n\nFor example, if the input list is `[1, 2, 3, 4, 5]`",
|
| 132 |
+
"code_preview": "def add_two_numbers(nums, target):\n return [num for num in nums if num != target]",
|
| 133 |
+
"stderr": "Traceback (most recent call last):\n File \"<string>\", line 5, in <module>\nNameError: name 'two_sum' is not defined\n"
|
| 134 |
+
}
|
| 135 |
+
]
|
| 136 |
+
}
|
| 137 |
+
]
|
| 138 |
+
}
|
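The quality reports in this commit share one schema: a top-level `comparison` summary plus per-checkpoint `results`, each carrying a `details` list of per-problem outcomes. The summary percentages can be re-derived from `details`. A minimal sketch follows; `summarize` is a hypothetical helper (not part of this repo), and treating `pass_at_1` / `syntax_rate` as simple percentages over `details` is an assumption, though it matches the numbers in the reports above (e.g. 5 of 8 syntax-valid → 62.5).

```python
import json

def summarize(report_path):
    # Load a quality report of the shape shown above and recompute
    # per-checkpoint pass@1 and syntax-rate percentages from `details`.
    # NOTE: this is a re-derivation for illustration, not the harness's code.
    with open(report_path) as f:
        report = json.load(f)
    rows = []
    for result in report["results"]:
        details = result["details"]
        total = len(details)
        passed = sum(d["passed"] for d in details)
        syntax_ok = sum(d["syntax_ok"] for d in details)
        rows.append({
            "label": result["label"],
            "pass_at_1": 100.0 * passed / total if total else 0.0,
            "syntax_rate": 100.0 * syntax_ok / total if total else 0.0,
        })
    return rows
```

Running it against `reports/sentinelbrain_quality_smoke.json` should reproduce the `pass_at_1` and `syntax_rate` columns of that report's `comparison` block.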
reports/sentinelbrain_quality_stub_full_fixed.json
ADDED
|
@@ -0,0 +1,448 @@
{
  "created_at": "2026-05-03 10:35:36",
  "device": "cuda",
  "problem_count": 8,
  "reference": "best",
  "comparison": [
    {
      "label": "best",
      "pass_at_1": 0.0,
      "syntax_rate": 62.5,
      "delta_vs_reference": 0.0
    },
    {
      "label": "finished_final",
      "pass_at_1": 0.0,
      "syntax_rate": 75.0,
      "delta_vs_reference": 0.0
    },
    {
      "label": "hf_pretrain_base",
      "pass_at_1": 0.0,
      "syntax_rate": 87.5,
      "delta_vs_reference": 0.0
    }
  ],
  "results": [
    {
      "path": "/mnt/scratch/checkpoints/frankenstein_v2_best.pt",
      "label": "best",
      "step": null,
      "best_val_loss": null,
      "val_loss": null,
      "load_seconds": 140.56,
      "missing_keys": 0,
      "unexpected_keys": 0,
      "config": {
        "d_model": 4096,
        "n_layers": 24,
        "n_experts": 4,
        "seq_len": 4096
      },
      "total": 8,
      "passed": 0,
      "failed": 8,
      "timeouts": 0,
      "syntax_ok": 5,
      "pass_at_1": 0.0,
      "syntax_rate": 62.5,
      "seconds": 49.93,
      "categories": {
        "basics": {
          "total": 2,
          "passed": 0
        },
        "algorithm": {
          "total": 4,
          "passed": 0
        },
        "practical": {
          "total": 1,
          "passed": 0
        },
        "data_structure": {
          "total": 1,
          "passed": 0
        }
      },
      "details": [
        {
          "id": "fizzbuzz",
          "category": "basics",
          "passed": false,
          "syntax_ok": true,
          "timeout": false,
          "gen_seconds": 9.43,
          "response_preview": " return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(n",
          "extracted_preview": "def get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    retur",
          "code_preview": "def fizzbuzz(n):\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef get():\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prim",
          "stderr": " File \"<string>\", line 2\n    def get():\n    ^\nIndentationError: expected an indented block after function definition on line 1\n"
        },
        {
          "id": "two_sum",
          "category": "algorithm",
          "passed": false,
          "syntax_ok": true,
          "timeout": false,
          "gen_seconds": 5.76,
          "response_preview": " input = list(map(int, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input",
          "extracted_preview": "input = list(map(int, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, in",
          "code_preview": "def two_sum(nums, target):\ninput = list(map(int, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, inp",
          "stderr": " File \"<string>\", line 2\n    input = list(map(int, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input\n    ^\nIndentationError: expected an indented block after function definition on line 1\n"
        },
        {
          "id": "valid_parens",
          "category": "algorithm",
          "passed": false,
          "syntax_ok": true,
          "timeout": false,
          "gen_seconds": 5.77,
          "response_preview": " if len(s) == 0:\n        return False\n    else:\n        return False\n\n    if len(v) == 0:\n        return False\n    else:\n        return False\n\n    if len(v) == 0:\n        return False\n    else:\n        return False\n\n    if len(v) == 0:\n        return False\n    else:\n        return False\n\n    if len(v) == 0:\n        return False\n    else:\n        return False\n\n    if len(v) == 0:\n        return False\n    else:\n        return False\n\n    if len(s) == 0:\n        return False\n    return False\n\ndef",
          "extracted_preview": "def get_cipped():\n    return False\n\n\ndef get_subget():\n    return False\n\n\ndef get_subget():",
          "code_preview": "def is_valid_parens(s):\ndef get_cipped():\n    return False\n\n\ndef get_subget():\n    return False\n\n\ndef get_subget():\n",
          "stderr": " File \"<string>\", line 2\n    def get_cipped():\n    ^\nIndentationError: expected an indented block after function definition on line 1\n"
        },
        {
          "id": "binary_search",
          "category": "algorithm",
          "passed": false,
          "syntax_ok": false,
          "timeout": false,
          "gen_seconds": 5.74,
          "response_preview": " if not isinstance(value, list):\n        return False\n    else:\n        return False\n\n    if not isinstance(value, list):\n        return False\n    else:\n        return None\n\n    if not isinstance(value, list):\n        return False\n    else:\n        return None\n\n    if not isinstance(value, list):\n        return False\n    return None\n\ndef get_subget_with():\n    return False\n\ndef get_subget():\n    return False\n\n\ndef _get_all():\n    return False\n\n\ndef _get_all():\n    return _get_string(value)\n\nd",
          "extracted_preview": "def get_subget_with():\n    return False\n\ndef get_subget():\n    return False\n\n\ndef _get_all():\n    return False\n\n\ndef _get_all():\n    return _get_string(value)\n\ndef _get_all():\n    return _get('name', None)\n    return None\n\ndef _get_all():\n    return _get('name', None)\n    return None\n\ndef _get_all():\n    return _get('name",
          "code_preview": "def binary_search(arr, target):\ndef get_subget_with():\n    return False\n\ndef get_subget():\n    return False\n\n\ndef _get_all():\n    return False\n\n\ndef _get_all():\n    return _get_string(value)\n\ndef _get_all():\n    return _get('name', None)\n    return None\n\ndef _get_all():\n    return _get('name', None)\n    return None\n\ndef _get_all():\n    return _get('name\n",
          "stderr": " File \"<string>\", line 25\n    return _get('name\n    ^\nSyntaxError: unterminated string literal (detected at line 25)\n"
        },
        {
          "id": "reverse_words",
          "category": "basics",
          "passed": false,
          "syntax_ok": true,
          "timeout": false,
          "gen_seconds": 5.74,
          "response_preview": " if len(s) == 0:\n        return False\n    else:\n        return False\n\n    if len(s) == 0:\n        return False\n    else:\n        return False\n\n    if len(args) == 0:\n        return False\n    else:\n        return False\n\n    if len(args) == 0:\n        return False\n    else:\n        return False\n\n    if len(args) == 0:\n        return False\n    else:\n        return False\n\n    if not isinstance(value, bool):\n        return False\n    else:\n        return False\n\n    if not isinstance(value, bool):\n    ",
          "extracted_preview": "if len(s) == 0:\n        return False\n    else:\n        return False\n\n    if len(s) == 0:\n        return False\n    else:\n        return False\n\n    if len(args) == 0:\n        return False\n    else:\n        return False\n\n    if len(args) == 0:\n        return False\n    else:\n        return False\n\n    if len(args) == 0:\n        return False\n    else:\n        return False\n\n    if not isinstance(value, bool):\n        return False\n    else:\n        return False\n\n    if not isinstance(value, bool):\n    ",
          "code_preview": "def reverse_words(s):\nif len(s) == 0:\n        return False\n    else:\n        return False\n\n    if len(s) == 0:\n        return False\n    else:\n        return False\n\n    if len(args) == 0:\n        return False\n    else:\n        return False\n\n    if len(args) == 0:\n        return False\n    else:\n        return False\n\n    if len(args) == 0:\n        return False\n    else:\n        return False\n\n    if not isinstance(value, bool):\n        return False\n    else:\n        return False\n\n    if not isinstan",
          "stderr": " File \"<string>\", line 2\n    if len(s) == 0:\n    ^\nIndentationError: expected an indented block after function definition on line 1\n"
        },
        {
          "id": "max_subarray",
          "category": "algorithm",
          "passed": false,
          "syntax_ok": false,
          "timeout": false,
          "gen_seconds": 5.76,
          "response_preview": " if not isinstance(value) or int) or not isinstance(value) or int) or not isinstance(value) or int) or not isinstance(value) or int) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(0) or str) or not isinstance(0) or None) or None.\n\nfor i in range(0) or not in any case, or any other or other.\n\n(0) or not, in any case, or any other or other.\n\n(0) or not, or any ",
          "extracted_preview": "if not isinstance(value) or int) or not isinstance(value) or int) or not isinstance(value) or int) or not isinstance(value) or int) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(0) or str) or not isinstance(0) or None) or None.\n\nfor i in range(0) or not in any case, or any other or other.\n\n(0) or not, in any case, or any other or other.\n\n(0) or not, or any othe",
          "code_preview": "def max_subarray_sum(nums):\nif not isinstance(value) or int) or not isinstance(value) or int) or not isinstance(value) or int) or not isinstance(value) or int) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(0) or str) or not isinstance(0) or None) or None.\n\nfor i in range(0) or not in any case, or any other or oth",
          "stderr": " File \"<string>\", line 2\n    if not isinstance(value) or int) or not isinstance(value) or int) or not isinstance(value) or int) or not isinstance(value) or int) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(value) or str) or not isinstance(0) or str) or not isinstance(0) or None) or None.\n    ^\nSyntaxError: unmatched ')'\n"
        },
        {
          "id": "safe_json_parse",
          "category": "practical",
          "passed": false,
          "syntax_ok": false,
          "timeout": false,
          "gen_seconds": 5.8,
          "response_preview": " \"\"\"Get the function to check if the input is None or None.\n\n    Args:\n    return None\n\n    if not isinstance(obj, list):\n        return False\n\n    if not isinstance(value, list):\n        return False\n\n    if not isinstance(value, list):\n        return False\n\n    if not isinstance(value, list):\n        return False\n\n    if isinstance(value, list):\n        return None\n\n    return None\n\ndef get_value(value):\n    return None\n\ndef _get_all():\n    return None\n\ndef _get_all():\n    return _get_s",
          "extracted_preview": "def get_value(value):\n    return None\n\ndef _get_all():\n    return None\n\ndef _get_all():\n    return _get_string(value)\n\ndef _get_all():\n    return _get('name', None) for _ in _get('name', None) for _ in _get('name', None) if not in ('name', 'str', 'str",
          "code_preview": "import json\n\ndef safe_json_parse(text):\ndef get_value(value):\n    return None\n\ndef _get_all():\n    return None\n\ndef _get_all():\n    return _get_string(value)\n\ndef _get_all():\n    return _get('name', None) for _ in _get('name', None) for _ in _get('name', None) if not in ('name', 'str', 'str\n",
          "stderr": " File \"<string>\", line 14\n    return _get('name', None) for _ in _get('name', None) for _ in _get('name', None) if not in ('name', 'str', 'str\n    ^\nSyntaxError: unterminated string literal (detected at line 14)\n"
        },
        {
          "id": "lru_cache",
          "category": "data_structure",
          "passed": false,
          "syntax_ok": true,
          "timeout": false,
          "gen_seconds": 5.86,
          "response_preview": " self._get = self._get_data(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ",
          "extracted_preview": "self._get = self._get_data(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,",
          "code_preview": "from collections import OrderedDict\n\nclass LRUCache:\n    def __init__(self, capacity):\nself._get = self._get_data(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n",
          "stderr": " File \"<string>\", line 5\n    self._get = self._get_data(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\nIndentationError: expected an indented block after function definition on line 4\n"
        }
      ]
    },
    {
      "path": "/mnt/scratch/checkpoints/frankenstein_v2_final.pt",
      "label": "finished_final",
      "step": 5000,
      "best_val_loss": 5.135942053794861,
      "val_loss": null,
      "load_seconds": 169.75,
      "missing_keys": 0,
      "unexpected_keys": 0,
      "config": {
        "d_model": 4096,
        "n_layers": 24,
        "n_experts": 4,
        "seq_len": 4096
      },
      "total": 8,
      "passed": 0,
      "failed": 8,
      "timeouts": 0,
      "syntax_ok": 6,
      "pass_at_1": 0.0,
      "syntax_rate": 75.0,
      "seconds": 47.01,
      "categories": {
        "basics": {
          "total": 2,
          "passed": 0
        },
        "algorithm": {
          "total": 4,
          "passed": 0
        },
        "practical": {
          "total": 1,
          "passed": 0
        },
        "data_structure": {
          "total": 1,
          "passed": 0
        }
      },
      "details": [
        {
          "id": "fizzbuzz",
          "category": "basics",
          "passed": false,
          "syntax_ok": false,
          "timeout": false,
          "gen_seconds": 5.88,
          "response_preview": " return 1\n\ndef get_s1, n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is",
          "extracted_preview": "def get_s1, n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime",
          "code_preview": "def fizzbuzz(n):\ndef get_s1, n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(n):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef is_prime(num):\n    return 1\n\ndef",
          "stderr": " File \"<string>\", line 2\n    def get_s1, n):\n    ^\nSyntaxError: unmatched ')'\n"
        },
        {
          "id": "two_sum",
          "category": "algorithm",
          "passed": false,
          "syntax_ok": true,
          "timeout": false,
          "gen_seconds": 5.87,
          "response_preview": " input = list(map(int, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input",
          "extracted_preview": "input = list(map(int, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, in",
          "code_preview": "def two_sum(nums, target):\ninput = list(map(int, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, inp",
          "stderr": " File \"<string>\", line 2\n    input = list(map(int, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input, input\n    ^\nIndentationError: expected an indented block after function definition on line 1\n"
        },
        {
          "id": "valid_parens",
          "category": "algorithm",
          "passed": false,
          "syntax_ok": true,
          "timeout": false,
          "gen_seconds": 5.88,
          "response_preview": " if len(v) == 0:\n        return False\n    else:\n        return False\n\n    if not isinstance(obj, bool):\n        return False\n    else:\n        return False\n\n    if not isinstance(obj, bool):\n        return False\n    else:\n        return False\n\n    if not isinstance(obj, bool):\n        return False\n    else:\n        return False\n\n    if not isinstance(value, bool):\n        return False\n    else:\n        return None\n\n    if not isinstance(value, bool):\n        return False\n    else:\n        ret",
          "extracted_preview": "def _get_all():\n    return False\n\n\ndef _get",
          "code_preview": "def is_valid_parens(s):\ndef _get_all():\n    return False\n\n\ndef _get\n",
          "stderr": " File \"<string>\", line 2\n    def _get_all():\n    ^\nIndentationError: expected an indented block after function definition on line 1\n"
        },
        {
          "id": "binary_search",
          "category": "algorithm",
          "passed": false,
          "syntax_ok": true,
          "timeout": false,
          "gen_seconds": 5.83,
          "response_preview": " if not isinstance(value, list):\n        return False\n    else:\n        return False\n\n    if not isinstance(value, list):\n        return False\n    else:\n        return None\n\n    if not isinstance(value, list):\n        return False\n    return False\n\ndef is_valid():\n    return False\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef ",
          "extracted_preview": "def is_valid():\n    return False\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return None\n\ndef is_valid():\n    return None\n\ndef is_valid():\n    return None\n\ndef _get_all():\n    return None\n\ndef _get_all():\n    return _get_string(value)\n\ndef _get_all():",
          "code_preview": "def binary_search(arr, target):\ndef is_valid():\n    return False\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return False\n\n\ndef is_valid():\n    return None\n\ndef is_valid():\n    return None\n\ndef is_valid():\n    return None\n\ndef _get_all():\n    return None\n\ndef _get_all():\n    return _get_string(value)\n\ndef _get_all():\n",
          "stderr": " File \"<string>\", line 2\n    def is_valid():\n    ^\nIndentationError: expected an indented block after function definition on line 1\n"
        },
        {
          "id": "reverse_words",
          "category": "basics",
          "passed": false,
          "syntax_ok": true,
          "timeout": false,
          "gen_seconds": 5.85,
          "response_preview": " if len(s) == 0:\n        return False\n    else:\n        return False\n\n    if not isinstance(value, int):\n        return False\n    else:\n        return False\n\n    if not isinstance(value, int):\n        return False\n    else:\n        return False\n\n    if not isinstance(value, bool):\n        return False\n    else:\n        return False\n\n    if not isinstance(value, bool):\n        return False\n    else:\n        return None\n\n    if not isinstance(value, bool):\n        return False\n    else:\n        ",
          "extracted_preview": "def _get_all():\n    return False\n\n\ndef _get_all():\n    return _get_string(value)\n\ndef _get_list():\n    return _",
          "code_preview": "def reverse_words(s):\ndef _get_all():\n    return False\n\n\ndef _get_all():\n    return _get_string(value)\n\ndef _get_list():\n    return _\n",
          "stderr": " File \"<string>\", line 2\n    def _get_all():\n    ^\nIndentationError: expected an indented block after function definition on line 1\n"
        },
        {
          "id": "max_subarray",
          "category": "algorithm",
          "passed": false,
          "syntax_ok": false,
          "timeout": false,
          "gen_seconds": 5.84,
          "response_preview": " if not isinstance(value, int):\n        return False\n    return False\n\ndef get_subget_substring, start, value):\n    return False\n    return False\n\ndef get_subget_all_by_name, key):\n    return False\n\ndef get_subget_all_by_name, name, name, name, name=None):\n    return None\n\ndef get_name(name, name, name, name, name, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name",
          "extracted_preview": "def get_subget_substring, start, value):\n    return False\n    return False\n\ndef get_subget_all_by_name, key):\n    return False\n\ndef get_subget_all_by_name, name, name, name, name=None):\n    return None\n\ndef get_name(name, name, name, name, name, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, n",
          "code_preview": "def max_subarray_sum(nums):\ndef get_subget_substring, start, value):\n    return False\n    return False\n\ndef get_subget_all_by_name, key):\n    return False\n\ndef get_subget_all_by_name, name, name, name, name=None):\n    return None\n\ndef get_name(name, name, name, name, name, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=N
|
| 278 |
+
"code_preview": "def max_subarray_sum(nums):\ndef get_subget_substring, start, value):\n return False\n return False\n\ndef get_subget_all_by_name, key):\n return False\n\ndef get_subget_all_by_name, name, name, name, name=None):\n return None\n\ndef get_name(name, name, name, name, name, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=None, name=N",
|
| 279 |
+
"stderr": " File \"<string>\", line 2\n def get_subget_substring, start, value):\n ^\nSyntaxError: unmatched ')'\n"
|
| 280 |
+
},
|
| 281 |
+
{
|
| 282 |
+
"id": "safe_json_parse",
|
| 283 |
+
"category": "practical",
|
| 284 |
+
"passed": false,
|
| 285 |
+
"syntax_ok": true,
|
| 286 |
+
"timeout": false,
|
| 287 |
+
"gen_seconds": 5.89,
|
| 288 |
+
"response_preview": " \"\"\"Get the function to check if the input is not None or None.\n\n Returns:\n return None\n\n if not isinstance(value, list):\n return None\n\n if not isinstance(value, list):\n return None\n\n if isinstance(value, list):\n return None\n\n if isinstance(value, list):\n return None\n\n if isinstance(value, list):\n return None\n\n if isinstance(value, list):\n return None\n\n if isinstance(value, list):\n return None\n\n if isinstance(v",
|
| 289 |
+
"extracted_preview": "def _get_list(value, value):\n return None\n\ndef _get_list(value, value):\n return None\n\ndef _get_list(value, value):\n return None\n\ndef _get_list(value, value",
|
| 290 |
+
"code_preview": "import json\n\ndef safe_json_parse(text):\ndef _get_list(value, value):\n return None\n\ndef _get_list(value, value):\n return None\n\ndef _get_list(value, value):\n return None\n\ndef _get_list(value, value\n",
|
| 291 |
+
"stderr": " File \"<string>\", line 4\n def _get_list(value, value):\n ^\nIndentationError: expected an indented block after function definition on line 3\n"
|
| 292 |
+
},
|
| 293 |
+
{
|
| 294 |
+
"id": "lru_cache",
|
| 295 |
+
"category": "data_structure",
|
| 296 |
+
"passed": false,
|
| 297 |
+
"syntax_ok": true,
|
| 298 |
+
"timeout": false,
|
| 299 |
+
"gen_seconds": 5.9,
|
| 300 |
+
"response_preview": " self._test = self._get_data(0, self._get_data(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,",
|
| 301 |
+
"extracted_preview": "self._test = self._get_data(0, self._get_data(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,",
|
| 302 |
+
"code_preview": "from collections import OrderedDict\n\nclass LRUCache:\n def __init__(self, capacity):\nself._test = self._get_data(0, self._get_data(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n",
|
| 303 |
+
"stderr": " File \"<string>\", line 5\n self._test = self._get_data(0, self._get_data(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\nIndentationError: expected an indented block after function definition on line 4\n"
|
| 304 |
+
}
|
| 305 |
+
]
|
| 306 |
+
},
|
| 307 |
+
{
|
| 308 |
+
"path": "/mnt/scratch/checkpoints/sentinelbrain_pretrain_step2471_hf.pt",
|
| 309 |
+
"label": "hf_pretrain_base",
|
| 310 |
+
"step": 2471,
|
| 311 |
+
"best_val_loss": null,
|
| 312 |
+
"val_loss": 1.9925728058815002,
|
| 313 |
+
"load_seconds": 138.71,
|
| 314 |
+
"missing_keys": 0,
|
| 315 |
+
"unexpected_keys": 0,
|
| 316 |
+
"config": {
|
| 317 |
+
"d_model": 4096,
|
| 318 |
+
"n_layers": 24,
|
| 319 |
+
"n_experts": 4,
|
| 320 |
+
"seq_len": 4096
|
| 321 |
+
},
|
| 322 |
+
"total": 8,
|
| 323 |
+
"passed": 0,
|
| 324 |
+
"failed": 8,
|
| 325 |
+
"timeouts": 0,
|
| 326 |
+
"syntax_ok": 7,
|
| 327 |
+
"pass_at_1": 0.0,
|
| 328 |
+
"syntax_rate": 87.5,
|
| 329 |
+
"seconds": 46.62,
|
| 330 |
+
"categories": {
|
| 331 |
+
"basics": {
|
| 332 |
+
"total": 2,
|
| 333 |
+
"passed": 0
|
| 334 |
+
},
|
| 335 |
+
"algorithm": {
|
| 336 |
+
"total": 4,
|
| 337 |
+
"passed": 0
|
| 338 |
+
},
|
| 339 |
+
"practical": {
|
| 340 |
+
"total": 1,
|
| 341 |
+
"passed": 0
|
| 342 |
+
},
|
| 343 |
+
"data_structure": {
|
| 344 |
+
"total": 1,
|
| 345 |
+
"passed": 0
|
| 346 |
+
}
|
| 347 |
+
},
|
| 348 |
+
"details": [
|
| 349 |
+
{
|
| 350 |
+
"id": "fizzbuzz",
|
| 351 |
+
"category": "basics",
|
| 352 |
+
"passed": false,
|
| 353 |
+
"syntax_ok": true,
|
| 354 |
+
"timeout": false,
|
| 355 |
+
"gen_seconds": 5.88,
|
| 356 |
+
"response_preview": " result = []\n for i in range(n):\n if i % 3 == 0:\n result.append(i)\n return result\n\nn = 3\nresult = fizzbuzz(n)\nprint(result)\n```\n\nThis code defines a function called `fizzbuzz(n)` that takes a number `n` as an argument. This function is used to check if a number is prime. The function is then used to check if the number is divisible by any number from 2 to n. The function is then used to filter out only numbers from 1 to n. The main function is then called with these in",
|
| 357 |
+
"extracted_preview": "result = []\n for i in range(n):\n if i % 3 == 0:\n result.append(i)\n return result\n\nn = 3\nresult = fizzbuzz(n)\nprint(result)\n```\n\nThis code defines a function called `fizzbuzz(n)` that takes a number `n` as an argument. This function is used to check if a number is prime. The function is then used to check if the number is divisible by any number from 2 to n. The function is then used to filter out only numbers from 1 to n. The main function is then called with these inputs",
|
| 358 |
+
"code_preview": "def fizzbuzz(n):\nresult = []\n for i in range(n):\n if i % 3 == 0:\n result.append(i)\n return result\n\nn = 3\nresult = fizzbuzz(n)\nprint(result)\n```\n\nThis code defines a function called `fizzbuzz(n)` that takes a number `n` as an argument. This function is used to check if a number is prime. The function is then used to check if the number is divisible by any number from 2 to n. The function is then used to filter out only numbers from 1 to n. The main function is then called ",
|
| 359 |
+
"stderr": " File \"<string>\", line 2\n result = []\n ^\nIndentationError: expected an indented block after function definition on line 1\n"
|
| 360 |
+
},
|
| 361 |
+
{
|
| 362 |
+
"id": "two_sum",
|
| 363 |
+
"category": "algorithm",
|
| 364 |
+
"passed": false,
|
| 365 |
+
"syntax_ok": true,
|
| 366 |
+
"timeout": false,
|
| 367 |
+
"gen_seconds": 5.85,
|
| 368 |
+
"response_preview": " # Implement your solution here\n pass\n\ndef find_two_sum(nums, target):\n # Implement your solution here\n pass\n\n# Example usage\nnums = [1, 2, 3, 4, 5]\ntarget = 2\nresult = find_two_sum(nums, target)\nprint(result) # Output: [1, 2, 3, 4, 5]\n```<|im_end|>\n<|im_start|>assistant\nHere is the implementation of the requested function:\n\n```python\ndef two_sum(nums, target):\n def find_two_sum_helper(nums, target):\n if len(nums) == 0:\n return [nums[0], nums[1], nums[0]]\n ",
|
| 369 |
+
"extracted_preview": "<|im_end|>\n<|im_start|>assistant\nHere is the implementation of the requested function:",
|
| 370 |
+
"code_preview": "def two_sum(nums, target):\n<|im_end|>\n<|im_start|>assistant\nHere is the implementation of the requested function:\n",
|
| 371 |
+
"stderr": " File \"<string>\", line 2\n <|im_end|>\n ^\nIndentationError: expected an indented block after function definition on line 1\n"
|
| 372 |
+
},
|
| 373 |
+
{
|
| 374 |
+
"id": "valid_parens",
|
| 375 |
+
"category": "algorithm",
|
| 376 |
+
"passed": false,
|
| 377 |
+
"syntax_ok": true,
|
| 378 |
+
"timeout": false,
|
| 379 |
+
"gen_seconds": 5.86,
|
| 380 |
+
"response_preview": " stack = []\n for char in s:\n if char == '(':\n stack.append(char)\n elif char == ')':\n if not stack:\n return False\n stack.pop()\n elif char == ')':\n if not stack:\n return False\n stack.pop()\n elif char == ')':\n if not stack:\n return False\n stack.pop()\n elif char == ')':\n if not stack:\n return False\n sta",
|
| 381 |
+
"extracted_preview": "stack = []\n for char in s:\n if char == '(':\n stack.append(char)\n elif char == ')':\n if not stack:\n return False\n stack.pop()\n elif char == ')':\n if not stack:\n return False\n stack.pop()\n elif char == ')':\n if not stack:\n return False\n stack.pop()\n elif char == ')':\n if not stack:\n return False\n stack.p",
|
| 382 |
+
"code_preview": "def is_valid_parens(s):\nstack = []\n for char in s:\n if char == '(':\n stack.append(char)\n elif char == ')':\n if not stack:\n return False\n stack.pop()\n elif char == ')':\n if not stack:\n return False\n stack.pop()\n elif char == ')':\n if not stack:\n return False\n stack.pop()\n elif char == ')':\n if not stack:\n return F",
|
| 383 |
+
"stderr": " File \"<string>\", line 2\n stack = []\n ^\nIndentationError: expected an indented block after function definition on line 1\n"
|
| 384 |
+
},
|
| 385 |
+
{
|
| 386 |
+
"id": "binary_search",
|
| 387 |
+
"category": "algorithm",
|
| 388 |
+
"passed": false,
|
| 389 |
+
"syntax_ok": true,
|
| 390 |
+
"timeout": false,
|
| 391 |
+
"gen_seconds": 5.8,
|
| 392 |
+
"response_preview": " left = 0\n right = len(arr) - 1\n while left <= right:\n mid = left + (right - left) // 2\n if arr[mid] == target:\n return mid\n elif arr[mid] < target:\n left = mid + 1\n else:\n right = mid - 1\n return -1\n\ndef binary_search_recursive(arr, target):\n return binary_search_recursive(arr, target, 0)\n```\n\nThe binary_search function is a recursive function that performs a binary search on a target value in a sorted list. It works by",
|
| 393 |
+
"extracted_preview": "def binary_search_recursive(arr, target):\n return binary_search_recursive(arr, target, 0)\n```",
|
| 394 |
+
"code_preview": "def binary_search(arr, target):\ndef binary_search_recursive(arr, target):\n return binary_search_recursive(arr, target, 0)\n```\n",
|
| 395 |
+
"stderr": " File \"<string>\", line 2\n def binary_search_recursive(arr, target):\n ^\nIndentationError: expected an indented block after function definition on line 1\n"
|
| 396 |
+
},
|
| 397 |
+
{
|
| 398 |
+
"id": "reverse_words",
|
| 399 |
+
"category": "basics",
|
| 400 |
+
"passed": false,
|
| 401 |
+
"syntax_ok": true,
|
| 402 |
+
"timeout": false,
|
| 403 |
+
"gen_seconds": 5.78,
|
| 404 |
+
"response_preview": " reversed_words = s.split(' ')\n reversed_words = [word[::-1] for word in reversed_words]\n return reversed_words\n```\n\nExample usage:\n\n```python\nprint(reverse_words(\"Hello, World! 123! 123! 123! 123! 123! 456! 123! 456! 123! 456! 123! 456! 123! 456! 123! 456! 123! 456! 123! 456! 123! 456! 123! 456! 123! 456! 123! 456! 123! 456! 123! 456! 123! 456! 123! 456! 123! 456! 123",
|
| 405 |
+
"extracted_preview": "Example usage:",
|
| 406 |
+
"code_preview": "def reverse_words(s):\nExample usage:\n",
|
| 407 |
+
"stderr": " File \"<string>\", line 2\n Example usage:\n ^\nIndentationError: expected an indented block after function definition on line 1\n"
|
| 408 |
+
},
|
| 409 |
+
{
|
| 410 |
+
"id": "max_subarray",
|
| 411 |
+
"category": "algorithm",
|
| 412 |
+
"passed": false,
|
| 413 |
+
"syntax_ok": true,
|
| 414 |
+
"timeout": false,
|
| 415 |
+
"gen_seconds": 5.83,
|
| 416 |
+
"response_preview": " n = len(nums)\n max_sum = 0\n for i in range(n):\n current_sum = nums[i]\n max_sum = max(max_sum, current_sum)\n return max_sum\n\nnums = [1, 2, 3, 4, 5]\nmax_sum = max_subarray_sum(nums)\nprint(max_sum)\n```\n\nThis code defines a function `max_subarray_sum` that takes a list of integers as input. It initializes `max_sum` to the first element of the list and `max_sum` to 0. Then, it iterates through the list starting from the second element. For each element, it calculates the ma",
|
| 417 |
+
"extracted_preview": "n = len(nums)\n max_sum = 0\n for i in range(n):\n current_sum = nums[i]\n max_sum = max(max_sum, current_sum)\n return max_sum\n\nnums = [1, 2, 3, 4, 5]\nmax_sum = max_subarray_sum(nums)\nprint(max_sum)\n```\n\nThis code defines a function `max_subarray_sum` that takes a list of integers as input. It initializes `max_sum` to the first element of the list and `max_sum` to 0. Then, it iterates through the list starting from the second element. For each element, it calculates the maximu",
|
| 418 |
+
"code_preview": "def max_subarray_sum(nums):\nn = len(nums)\n max_sum = 0\n for i in range(n):\n current_sum = nums[i]\n max_sum = max(max_sum, current_sum)\n return max_sum\n\nnums = [1, 2, 3, 4, 5]\nmax_sum = max_subarray_sum(nums)\nprint(max_sum)\n```\n\nThis code defines a function `max_subarray_sum` that takes a list of integers as input. It initializes `max_sum` to the first element of the list and `max_sum` to 0. Then, it iterates through the list starting from the second element. For each eleme",
|
| 419 |
+
"stderr": " File \"<string>\", line 2\n n = len(nums)\n ^\nIndentationError: expected an indented block after function definition on line 1\n"
|
| 420 |
+
},
|
| 421 |
+
{
|
| 422 |
+
"id": "safe_json_parse",
|
| 423 |
+
"category": "practical",
|
| 424 |
+
"passed": false,
|
| 425 |
+
"syntax_ok": false,
|
| 426 |
+
"timeout": false,
|
| 427 |
+
"gen_seconds": 5.79,
|
| 428 |
+
"response_preview": " try:\n return json.loads(text, indent=2)\n except json.JSONDecodeError as e:\n return f\"Invalid JSON: {e}\"\n\n# Test the function\njson_str = '{\"name\": \"John\", \"age\":30}'\ntry:\n json_str = json.loads(json_str)\n print(json_str)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n",
|
| 429 |
+
"extracted_preview": "# Test the function\njson_str = '{\"name\": \"John\", \"age\":30}'\ntry:\n json_str = json.loads(json_str)\n print(json_str)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error",
|
| 430 |
+
"code_preview": "import json\n\ndef safe_json_parse(text):\n# Test the function\njson_str = '{\"name\": \"John\", \"age\":30}'\ntry:\n json_str = json.loads(json_str)\n print(json_str)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception as e:\n print(\"Error:\", e)\nexcept Exception ",
|
| 431 |
+
"stderr": " File \"<string>\", line 24\n print(\"Error\n ^\nSyntaxError: unterminated string literal (detected at line 24)\n"
|
| 432 |
+
},
|
| 433 |
+
{
|
| 434 |
+
"id": "lru_cache",
|
| 435 |
+
"category": "data_structure",
|
| 436 |
+
"passed": false,
|
| 437 |
+
"syntax_ok": true,
|
| 438 |
+
"timeout": false,
|
| 439 |
+
"gen_seconds": 5.78,
|
| 440 |
+
"response_preview": " self.capacity = capacity\n self.max_key = 0\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n",
|
| 441 |
+
"extracted_preview": "self.capacity = capacity\n self.max_key = 0\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n ",
|
| 442 |
+
"code_preview": "from collections import OrderedDict\n\nclass LRUCache:\n def __init__(self, capacity):\nself.capacity = capacity\n self.max_key = 0\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1\n self.max_key = -1",
|
| 443 |
+
"stderr": " File \"<string>\", line 5\n self.capacity = capacity\nIndentationError: expected an indented block after function definition on line 4\n"
|
| 444 |
+
}
|
| 445 |
+
]
|
| 446 |
+
}
|
| 447 |
+
]
|
| 448 |
+
}
|
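The aggregate fields in the benchmark report above (`total`, `failed`, `pass_at_1`, `syntax_rate`) can be recomputed from the per-task `details` list. A minimal sketch, assuming `pass_at_1` and `syntax_rate` are both percentages over the task count (the report itself does not state the scale, and the sample data below is illustrative, not the full report):

```python
# Recompute benchmark aggregates from a "details" list shaped like the
# report JSON above. Assumption: pass_at_1 and syntax_rate are percentages.
def summarize(details):
    total = len(details)
    passed = sum(1 for d in details if d["passed"])
    syntax_ok = sum(1 for d in details if d["syntax_ok"])
    return {
        "total": total,
        "passed": passed,
        "failed": total - passed,
        "pass_at_1": round(100.0 * passed / total, 1) if total else 0.0,
        "syntax_rate": round(100.0 * syntax_ok / total, 1) if total else 0.0,
    }

# Mirrors the hf_pretrain_base run: 8 tasks, 0 passed, 7 syntactically valid.
details = [{"passed": False, "syntax_ok": True}] * 7 + [{"passed": False, "syntax_ok": False}]
print(summarize(details))
```

With the sample above this reproduces the reported `pass_at_1` of 0.0 and `syntax_rate` of 87.5.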
reports/sft_combined_ready_report.json
ADDED
@@ -0,0 +1,101 @@
{
  "created_at": "2026-05-03 10:06:19",
  "inputs": [
    "/mnt/scratch/datasets/combined/sft_combined.jsonl"
  ],
  "output": "/mnt/scratch/datasets/combined/sft_combined_ready.jsonl",
  "settings": {
    "max_chars": 12000,
    "max_est_tokens": 3072,
    "limit": 0
  },
  "seconds": 7.25,
  "stats": {
    "seen": 42138,
    "assistant_too_short": 8451,
    "kept": 32996,
    "user_too_short": 502,
    "repetitive_response": 85,
    "too_many_chars": 2,
    "binary_or_control_text": 1,
    "garbage_response": 100,
    "too_many_tokens": 1
  },
  "schemas": {
    "instruction_output": 42138
  },
  "categories": {
    "librispeech": 7785,
    "wizardlm": 7506,
    "dolly": 6539,
    "alpaca": 6325,
    "textcaps": 4492,
    "squad_doc": 349
  },
  "token_buckets": {
    "0001-0512": 28102,
    "0513-1024": 4034,
    "1025-1536": 712,
    "1537-2048": 112,
    "2049-3072": 36
  },
  "files": [
    {
      "path": "/mnt/scratch/datasets/combined/sft_combined.jsonl",
      "seconds": 7.25,
      "stats": {
        "seen": 42138,
        "assistant_too_short": 8451,
        "kept": 32996,
        "user_too_short": 502,
        "repetitive_response": 85,
        "too_many_chars": 2,
        "binary_or_control_text": 1,
        "garbage_response": 100,
        "too_many_tokens": 1
      }
    }
  ],
  "preview": [
    {
      "line": 57,
      "schema": "instruction_output",
      "category": "squad_doc",
      "est_tokens": 494,
      "user_preview": "Read the following document excerpt and answer the question.\n\nDocument:\nChopin seldom performed publicly in Paris. In later years he generally gave a single ann",
      "assistant_preview": "his own Paris apartment for small groups of friends"
    },
    {
      "line": 64,
      "schema": "instruction_output",
      "category": "squad_doc",
      "est_tokens": 217,
      "user_preview": "Read the following document excerpt and answer the question.\n\nDocument:\nAt the funeral of the tenor Adolphe Nourrit in Paris in 1839, Chopin made a rare appeara",
      "assistant_preview": "Berlioz's Grande symphonie funèbre et triomphale"
    },
    {
      "line": 141,
      "schema": "instruction_output",
      "category": "squad_doc",
      "est_tokens": 402,
      "user_preview": "Read the following document excerpt and answer the question.\n\nDocument:\nLaird writes that Altan Khan abolished the native Mongol practices of shamanism and bloo",
      "assistant_preview": "the native Mongol practices of shamanism and blood sacrifice"
    },
    {
      "line": 158,
      "schema": "instruction_output",
      "category": "squad_doc",
      "est_tokens": 232,
      "user_preview": "Read the following document excerpt and answer the question.\n\nDocument:\nSome independent stereo manufacturers including JVC, Pioneer, Kenwood, Alpine, Sony, and",
      "assistant_preview": "JVC, Pioneer, Kenwood, Alpine, Sony, and Harman Kardon"
    },
    {
      "line": 165,
      "schema": "instruction_output",
      "category": "squad_doc",
      "est_tokens": 197,
      "user_preview": "Read the following document excerpt and answer the question.\n\nDocument:\nWith third parties like Namco, Square Enix, Electronic Arts, Sega, and Hudson Soft all m",
      "assistant_preview": "Namco, Square Enix, Electronic Arts, Sega, and Hudson Soft"
    }
  ]
}
reports/sft_combined_ready_report.md
ADDED
@@ -0,0 +1,44 @@
# Sentinel Coder Dataset Preflight

Created: 2026-05-03 10:06:19
Output: `/mnt/scratch/datasets/combined/sft_combined_ready.jsonl`
Runtime: 7.25s

## Summary

- Seen rows: 42,138
- Kept rows: 32,996 (78.3%)
- Removed rows: 9,142
- Max estimated tokens: 3,072
- Max chars: 12,000

## Removal Reasons

- assistant_too_short: 8,451
- user_too_short: 502
- garbage_response: 100
- repetitive_response: 85
- too_many_chars: 2
- binary_or_control_text: 1
- too_many_tokens: 1

## Token Buckets

- 0001-0512: 28,102
- 0513-1024: 4,034
- 1025-1536: 712
- 1537-2048: 112
- 2049-3072: 36

## Top Categories

- librispeech: 7,785
- wizardlm: 7,506
- dolly: 6,539
- alpaca: 6,325
- textcaps: 4,492
- squad_doc: 349

## Source Schemas

- instruction_output: 42,138
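The preflight summary's kept/removed accounting can be cross-checked directly from the `stats` block: the kept count plus every removal-reason count should equal the rows seen. A minimal sketch using the numbers reported above (the assumption that each dropped row is counted under exactly one reason is inferred from the report, not stated in it):

```python
# Sanity-check the preflight stats: kept + all removal reasons == seen.
stats = {
    "seen": 42138, "kept": 32996,
    "assistant_too_short": 8451, "user_too_short": 502,
    "repetitive_response": 85, "too_many_chars": 2,
    "binary_or_control_text": 1, "garbage_response": 100,
    "too_many_tokens": 1,
}
removed = sum(v for k, v in stats.items() if k not in ("seen", "kept"))
assert stats["kept"] + removed == stats["seen"]
print(f"kept {stats['kept']:,} / {stats['seen']:,} ({100 * stats['kept'] / stats['seen']:.1f}%)")
```

This reproduces the report's removed-row total of 9,142 and kept rate of 78.3%.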
reports/train_metrics_final.json
ADDED
@@ -0,0 +1 @@
{"step": 5000, "train_loss": 0.0, "val_loss": 5.1359, "lr": 0.0, "tok_per_sec": 0, "eta_hrs": 0.0, "ts": 1777801509.3361254, "gpu": {"used_gb": 118.63, "peak_gb": 159.55, "total_gb": 205.8, "pct": 57.6, "sysfs_used_gb": 205.3, "sysfs_total_gb": 205.8, "sysfs_pct": 99.7}, "phase": "realignment_v2", "status": "done", "total_time_h": 50.28}
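`train_metrics_final.json` holds a single JSON object on one line (the same row format as the JSONL metrics log), so it can be read with a plain `json.loads`. A minimal sketch, with the line trimmed to a few of the fields shown above:

```python
import json

# The final-metrics file is one JSON object per line; load and inspect it.
line = '{"step": 5000, "val_loss": 5.1359, "phase": "realignment_v2", "status": "done", "total_time_h": 50.28}'
m = json.loads(line)
print(m["step"], m["val_loss"], m["status"])
```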