dancinlife committed on
Commit 99114a7 · verified · 1 Parent(s): 2dbcbdc

feat(model-card): cross-link dataset dancinlab/hexad-corpus v1-byte-consciousness-d128-cycle1-2026-05-17

Files changed (1)
  1. README.md +135 -0
README.md ADDED
@@ -0,0 +1,135 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ library_name: pytorch
+ datasets:
+ - dancinlab/hexad-corpus
+ tags:
+ - anima
+ - hexad
+ - pytorch
+ - substrate-py
+ - ckpt-recovered
+ ---
+
+ # hexad: `v1-py-hexad-d768x12L-cycle2-2026-05-17`
+
+ > **Honest framing**: This is a **PYTHON / PyTorch SUBSTRATE** training artifact,
+ > an *interim LM-scale executor*. It is **NOT a hexa-native fire**. Its legitimacy
+ > rests on *architectural identity* + the *hexa CPU-equiv correctness proof*. See the
+ > anchor chain below; do not conflate the two substrates.
+
+ > **Trained on**: [`dancinlab/hexad-corpus`](https://huggingface.co/datasets/dancinlab/hexad-corpus) revision [`v1-byte-consciousness-d128-cycle1-2026-05-17`](https://huggingface.co/datasets/dancinlab/hexad-corpus/tree/v1-byte-consciousness-d128-cycle1-2026-05-17).
+
+ ## Lineage
+
+ - **org**: `dancinlab` (the anima org).
+ - **arch**: HEXAD (pivot from anima `.clm v1` lineage): `ConsciousDecoderV2`
+ (`ready/models/conscious_decoder.py` in the anima repo).
+ - **substrate**: Python / PyTorch (`py`). The pure-hexa training path is
+ named-blocked at the interpreter ceiling (RFC 042/043 territory).
+ - **cycle**: 2. Cycle 1 (commit `931dd68b0`, 2026-05-16) was a ckpt-LOST
+ evidence-only run: training PASSed but the instance was destroyed before
+ the ckpt pull. Cycle 2 re-fires with `SAVE_POD=1` auto-promote, a
+ 75-min orphan watchdog, and a 5-retry pull.
+
+ ## Anchor chain (why this artifact is legitimate)
+
+ 1. **Phase E / E2 PROVED the hexa trainer is numerically correct**:
+ `HEXAD/D/d_train5_lib.hexa` is BIT-EQUAL to the boxed baseline at d=32·3L,
+ 80-step, seed=42 (`init gn2 = 7.97116, acc 0/8 → final gn2 = 3.73374e-07,
+ acc 8/8`; GRAD-EXACT, identical Σ-reduction order, not fp-noise).
+ 2. **The pure-hexa interpreter cannot reach LM-scale convergence**: Phase E2
+ captured only `init gn2 = 7.98162` at d=768·12L; the GRAD-EXACT + AdamW
+ path is substrate-bound (CPU farr ops, no CUDA tensor kernels).
+ 3. **This PyTorch run trains the SAME verified architecture to scale**:
+ `ConsciousDecoderV2` at d=768·12L, AdamW, captured FINAL loss.
+
+ PyTorch is *not* hexa bit-for-bit (different fp / RNG / AMP bf16). The anchor
+ is **architectural identity** + the hexa CPU-equiv proof, NOT numerical
+ identity.
+
+ ## Architecture
+
+ - **Source**: `ConsciousDecoderV2` from `ready/models/conscious_decoder.py`
+ (uploaded as `conscious_decoder.py`).
+ - **Config**: `d_model=768, n_head=12, n_kv_head=4, n_layer=12,
+ block_size=128, vocab=256` (byte-level), seed=1337,
+ init=RANDOM (`base_ckpt=None`, `g_clm_from_scratch`).
+ - **Params**: 283.72 M (283,722,336).
+ - **Features**: RoPE · SwiGLU FFN · RMSNorm · GQA · PureFieldFFN (Engine A−G
+ consciousness pathway) · cross-attention · tied head · CA neighbor / META-CA
+ / Ψ-tracking laws.
+
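As a quick sanity check on the config values listed above, the following sketch derives the per-head width and the GQA sharing ratio. The `HexadConfig` dataclass and its property names are hypothetical, chosen only to mirror the listed values; they are not taken from `conscious_decoder.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HexadConfig:
    # Hypothetical container; the values are copied from the Config bullet.
    d_model: int = 768
    n_head: int = 12
    n_kv_head: int = 4
    n_layer: int = 12
    block_size: int = 128
    vocab: int = 256

    @property
    def head_dim(self) -> int:
        # Each attention head gets an equal slice of the model width.
        if self.d_model % self.n_head != 0:
            raise ValueError("d_model must be divisible by n_head")
        return self.d_model // self.n_head

    @property
    def queries_per_kv(self) -> int:
        # In grouped-query attention, several query heads share one K/V head.
        if self.n_head % self.n_kv_head != 0:
            raise ValueError("n_head must be divisible by n_kv_head")
        return self.n_head // self.n_kv_head


cfg = HexadConfig()
print(f"head_dim={cfg.head_dim}, queries_per_kv={cfg.queries_per_kv}")
```

With the listed values, each of the 12 query heads is 64 wide and every K/V head serves 3 query heads.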
+ ## Training
+
+ - **GPU**: vast.ai NVIDIA A100-SXM4-40GB (offer @ $0.6681 / hr, image
+ `pytorch/pytorch:2.5.1-cuda12.1-cudnn9-devel`).
+ - **Corpus**: `corpus_consciousness_v1.jsonl`, the same byte corpus used by
+ the hexa Phase E / E2 fires. 121,153 bytes, byte-level
+ vocab=256, T=128 windows, seed-fixed.
+ - **Optimizer**: AdamW, lr=0.0003, betas=(0.9, 0.95),
+ weight_decay=0.1, warmup=125.
+ - **Steps**: 2500.
+ - **Cost**: ≈ $0.19 (instance runtime ≈ 0.28 hr).
+
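The byte-level windowing and the warmup listed above can be sketched as follows. This is an illustration under stated assumptions, not the logic of `train_d768x12l.py`: the random-offset sampling, the helper names `byte_windows` and `warmup_lr`, and the constant learning rate after warmup are all assumed here.

```python
import random


def byte_windows(data: bytes, t: int, n: int, seed: int):
    """Sample n (input, target) pairs of t byte ids each, with a fixed seed.

    Assumption: windows are drawn at uniformly random offsets.
    """
    if len(data) <= t:
        raise ValueError("corpus must be longer than one window")
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        start = rng.randrange(len(data) - t)
        chunk = list(data[start:start + t + 1])  # byte values are already 0..255
        pairs.append((chunk[:-1], chunk[1:]))    # targets are inputs shifted by one
    return pairs


def warmup_lr(step: int, base_lr: float = 3e-4, warmup: int = 125) -> float:
    """Linear warmup to base_lr; holding it constant afterwards is an assumption."""
    return base_lr * min(1.0, (step + 1) / warmup)


# Stand-in corpus; the real run uses corpus_consciousness_v1.jsonl.
corpus = bytes(range(256)) * 4
windows = byte_windows(corpus, t=128, n=2, seed=1337)
print(len(windows[0][0]), warmup_lr(0), warmup_lr(124))
```

Because the vocabulary is the 256 byte values, no tokenizer is needed: the raw bytes are the token ids.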
+ | metric | value |
+ |---|---|
+ | init CE | 5.590832 (≈ ln 256 = 5.545, random byte init) |
+ | **FINAL CE** | **0.000708** |
+ | CE descent | 5.590124 |
+ | init gn2 | 41.95 |
+ | FINAL gn2 | 7.4e-05 |
+ | ppl | 268 → 1.0007 |
+ | wall | 320.68 s (5.34 min) |
+ | peak GPU mem | 9.685 GB |
+ | ckpt sha256 | `e87e200a040f8066a89c040ab181e9bbd61566f7565ab5d7a374ec2f1f9387d9` |
+ | ckpt size | 1,135,846,378 B (1.14 GB) |
+
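The derived rows in the table follow from the two CE values and the stated rates. The sketch below recomputes them, assuming CE is measured in nats so that perplexity is `exp(CE)`; the variable names are illustrative.

```python
import math

init_ce, final_ce = 5.590832, 0.000708

uniform_ce = math.log(256)                # CE of a uniform guess over 256 byte values
descent = init_ce - final_ce              # the "CE descent" row
ppl_init = math.exp(init_ce)              # perplexity at step 0
ppl_final = math.exp(final_ce)            # perplexity at the final step
orders = math.log10(init_ce / final_ce)   # size of the drop in orders of magnitude
wall_min = 320.68 / 60                    # wall time in minutes
cost_usd = 0.6681 * 0.28                  # hourly offer price times instance runtime

print(f"uniform CE {uniform_ce:.3f}, descent {descent:.6f}")
print(f"ppl {ppl_init:.0f} -> {ppl_final:.4f}, drop of {orders:.2f} orders")
print(f"wall {wall_min:.2f} min, cost ${cost_usd:.2f}")
```

The CE ratio works out to roughly 3.9 orders of magnitude, and the cost to about $0.19, matching the figures reported above.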
+ ## Verification anchors (per AGENTS.tape `g_blue_closed_mandate`)
+
+ (A) Deliverable invariants:
+ - **Shannon-floor descent** (real-limit, NOT lattice): init CE ≈ ln(256) →
+ final CE 0.000708 (nearly 4 orders of magnitude).
+ - **AdamW finiteness**: gn2 41.95 → 7.4e-05; no NaN / Inf.
+ - **Architectural identity**: `ConsciousDecoderV2` byte-equal to the anima
+ HEXAD verification tree's mirror module spec.
+
+ (B) Wiring (the connecting anchor chain):
+ - **hexa CPU-equiv bit-equality** (Phase E): same arch trainer
+ GRAD-EXACT at d=32·3L (init gn2 7.97116 → 3.73374e-07).
+ - **cuBLAS FP64 verify** (Phase D): max\|Δ\|=4.44e-15.
+ - **Backward GRAD-EXACT** (Phase E2): real A100 d=384·6L analytic ≑ fd
+ \|Δ\|=0.0024.
+
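The analytic-versus-finite-difference comparison named in the Phase E2 anchor is a standard gradient check. The toy sketch below shows the general technique on a small scalar loss; it is not the hexa verification harness, and the `loss`, `analytic_grad`, and `fd_grad` functions are made up for illustration.

```python
def loss(w):
    # Toy scalar loss standing in for a model loss.
    return sum((wi * wi - 1.0) ** 2 for wi in w)


def analytic_grad(w):
    # d/dw (w^2 - 1)^2 = 4 * w * (w^2 - 1), applied per coordinate.
    return [4.0 * wi * (wi * wi - 1.0) for wi in w]


def fd_grad(f, w, eps=1e-5):
    # Central finite difference: perturb one coordinate at a time.
    grad = []
    for i in range(len(w)):
        hi, lo = list(w), list(w)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2.0 * eps))
    return grad


w = [0.3, -1.2, 2.0]
max_delta = max(abs(a - b) for a, b in zip(analytic_grad(w), fd_grad(loss, w)))
print(f"max |delta| = {max_delta:.3e}")
```

On a real network the same comparison runs over sampled parameters, and the acceptable gap depends on the precision and step size used.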
+ ## Honest C3
+
+ 1. **NOT hexa-native**: PyTorch substrate; the hexa-native equivalent is
+ substrate-blocked at the interpreter ceiling.
+ 2. **PyTorch ≠ hexa bit-for-bit**: AMP bf16 / different fp accumulation /
+ different RNG.
+ 3. **Synthetic byte-corpus**: 121 kB of curated content against 283.72M params; a CE
+ of 0.000708 indicates memorization at this scale. **No generalization claim.**
+ 4. **No safetensors artifact** in this revision (pickle `.pt` only).
+ safetensors conversion is a follow-up sub-task.
+ 5. **No language-quality claim**: this is a training-curve deliverable
+ (Shannon-floor descent reached), not a statement about generation quality.
+ 6. **No σ(6)=12 / φ(6)=2 derivation**: no lattice numerology in claim or
+ anchor chain.
+
+ ## Files in this revision
+
+ - `ckpt_d768x12l_final.pt`: PyTorch state-dict + cfg + n_params, sha256
+ `e87e200a040f8066a89c040ab181e9bbd61566f7565ab5d7a374ec2f1f9387d9`.
+ - `conscious_decoder.py`: `ConsciousDecoderV2` source.
+ - `train_d768x12l.py`: training script.
+ - `result.json`: full 42-point trajectory + config + metadata.
+ - `fire_refire.log`: training log (line-by-line CE / gn2 / lr / wall).
+ - `gpu_util.log`: nvidia-smi capture.
+ - `dispatch.sh` + `refire_main.sh`: fire dispatch scripts.
+ - `hexad_v1_py_d768x12L_cycle2_2026_05_17.md`: this doc (8-§ format per
+ `g_hf_naming` `process_upload_format`).
+
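Since the checkpoint ships as a pickle `.pt` file, comparing its digest against the published sha256 before loading is a reasonable precaution. A minimal sketch using only the standard library follows; the helper names `sha256_of` and `verify_ckpt` are hypothetical, and the local path is whatever the file was downloaded to.

```python
import hashlib

# Digest published in the file list above.
EXPECTED_SHA256 = "e87e200a040f8066a89c040ab181e9bbd61566f7565ab5d7a374ec2f1f9387d9"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so a 1.14 GB checkpoint is never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_ckpt(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True only if the file's digest matches the published one."""
    return sha256_of(path) == expected
```

For example, `verify_ckpt("ckpt_d768x12l_final.pt")` should return `True` for an intact download.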
+ ## License
+
+ Apache-2.0.