Starred09 committed
Commit 68ada46 · 1 Parent(s): 05e5a00

Clean model card for public Graphite 1.0 4B release

Files changed (1): README.md (+42 -45)
README.md CHANGED
@@ -33,8 +33,6 @@ tags:
 - logic and factual precision
 - bilingual Russian / English instruction following
 
-This repository keeps the legacy slug `obsidian-critic-qwen35-4b-base-lora` because that was the original public upload target, but the public model name for documentation and grant material is **Graphite 1.0 4B**.
-
 ## What This Repository Contains
 
 This repo contains a **LoRA adapter**, not merged base weights.
@@ -57,12 +55,11 @@ Files of interest:
 
 ## Training Lineage
 
-This adapter corresponds to the **first public Kaggle 2xT4 full fine-tune stream** before the later `Graphite 1.1` reweight experiment. For provenance purposes:
+This adapter corresponds to the **first public Graphite 1.0 4B full fine-tune stream**.
 
-- the original dataset family is **`obsidian-critic-broad-mix-20260321`**
-- the training stack is **Unsloth + TRL + torchrun DDP on dual T4**
-- the public upload target for this run was this repo
-- later `Graphite 1.1` experiments are intentionally excluded from this card
+- dataset family: **`obsidian-critic-broad-mix-20260321`**
+- training stack: **Unsloth + TRL + torchrun DDP**
+- base model: **`Qwen/Qwen3.5-4B-Base`**
 
 Notebook lineage used for this stream:
 
@@ -74,7 +71,6 @@ Notebook lineage used for this stream:
 The training data for this first public stream comes from the mixed dataset:
 
 - dataset name: `obsidian-critic-broad-mix-20260321`
-- local source dir: `/home/starred/datasets/obsidian-critic-broad-mix-20260321`
 - examples in mixed dataset: `37,008`
 - approximate token volume: `6,885,960`
 - exact duplicate `(user, assistant)` pairs removed during mix build: `3,469`
@@ -111,33 +107,42 @@ The public training run then created a deterministic train / validation split an
 | `wave_backfill` | 230 | 218,922 |
 | `long_context` | 24 | 350,954 |
 
-### Source Datasets Included In The Broad Mix
-
-- **`obsidian_docs`**: `docs-markdown-sft-20260318-v3`, `docs-engineering-review-topup-sft-20260320`, `docs-topup-sft-20260320`, `docs-topup-ru-sft-20260320`
-- **`tool_use`**: `format-tool-discipline-sft-20260319`, `code-agent-tooluse-sft-20260319`, `code-agent-tooluse-ru-topup-sft-20260320`
-- **`greenfield`**: `code-architecture-sft-20260319`, `ts-rust-coding-sft-20260318-v3`
-- **`repair`**: `runtime-debug-grounded-sft-20260319`, `multi-file-repo-repair-sft-20260319`, `code-repair-patch-sft-20260319`, `code-fix-critical-topup-sft-20260321`
-- **`review`**: `security-repair-review-sft-20260319`, `ts-rust-code-review-sft-20260318-v3`
-- **`integration`**: `db-and-migrations-sft-20260319`, `backend-frontend-ops-sft-20260319`
-- **`reasoning`**: `tdd-test-first-sft-20260319`, `multi-step-debug-sft-20260319`
-- **`agent_core`**: `agent-gap-fixes-sft-20260320`, `agent-gap-fixes-ru-topup-sft-20260320`
-- **`robustness`**: `robustness-noise-traps-sft-20260320`, `robustness-noise-traps-ru-topup-sft-20260320`
-- **`core_real`**: `real-world-grounded-topup-sft-20260320`, `real-world-seed-expansion-sft-20260321`
-- **`long_context`**: `long-context-memory-topup-sft-20260321`
-- **`regularizer`**: `anti-overthinking-pack-sft-20260321`
-- **`logic`**: `logic-core-sft-20260319`, `logic-sanity-sft-20260319`, `logic-precision-ru-sft-20260319`
-- **`factual`**: `factual-erudition-sft-20260319`
-- **`wave_backfill`**: `wave-01-growth-sft-20260319`, `wave-02-growth-sft-20260319`, `wave-03-growth-sft-20260320`
-
-Explicitly excluded from the mix build:
-
-- `anti-regression-eval-20260319`
-- `curated-code-train-mix-20260320`
-- `graphite-1.0-code-train-mix-20260321`
-- `css-ui-premium-sft-20260319`
-- `css-style-premium-sft-20260320`
-- `css-style-premium-ru-topup-sft-20260320`
-- `css-style-sft-20260318`
+### Source Dataset Table
+
+| Dataset | Role | Examples | Approx. tokens |
+| --- | --- | ---: | ---: |
+| `real-world-grounded-topup-sft-20260320` | `core_real` | 3,000 | 804,280 |
+| `robustness-noise-traps-sft-20260320` | `robustness` | 3,200 | 363,139 |
+| `factual-erudition-sft-20260319` | `factual` | 2,960 | 142,787 |
+| `agent-gap-fixes-sft-20260320` | `agent_core` | 2,500 | 555,785 |
+| `code-fix-critical-topup-sft-20260321` | `repair` | 2,500 | 440,355 |
+| `code-agent-tooluse-sft-20260319` | `tool_use` | 2,400 | 208,335 |
+| `docs-engineering-review-topup-sft-20260320` | `obsidian_docs` | 1,600 | 206,794 |
+| `format-tool-discipline-sft-20260319` | `tool_use` | 1,582 | 179,009 |
+| `multi-step-debug-sft-20260319` | `reasoning` | 1,200 | 345,084 |
+| `real-world-seed-expansion-sft-20260321` | `core_real` | 1,200 | 238,907 |
+| `runtime-debug-grounded-sft-20260319` | `repair` | 1,193 | 218,845 |
+| `logic-core-sft-20260319` | `logic` | 1,131 | 139,188 |
+| `code-architecture-sft-20260319` | `greenfield` | 1,100 | 308,380 |
+| `tdd-test-first-sft-20260319` | `reasoning` | 1,000 | 310,540 |
+| `logic-sanity-sft-20260319` | `logic` | 996 | 69,260 |
+| `code-repair-patch-sft-20260319` | `repair` | 955 | 145,544 |
+| `logic-precision-ru-sft-20260319` | `logic` | 904 | 89,364 |
+| `security-repair-review-sft-20260319` | `review` | 893 | 159,086 |
+| `db-and-migrations-sft-20260319` | `integration` | 867 | 119,740 |
+| `agent-gap-fixes-ru-topup-sft-20260320` | `agent_core` | 700 | 89,641 |
+| `code-agent-tooluse-ru-topup-sft-20260320` | `tool_use` | 700 | 68,256 |
+| `multi-file-repo-repair-sft-20260319` | `repair` | 705 | 179,146 |
+| `backend-frontend-ops-sft-20260319` | `integration` | 606 | 121,591 |
+| `docs-topup-sft-20260320` | `obsidian_docs` | 600 | 125,125 |
+| `anti-overthinking-pack-sft-20260321` | `regularizer` | 500 | 55,840 |
+| `docs-markdown-sft-20260318-v3` | `obsidian_docs` | 440 | 149,852 |
+| `ts-rust-code-review-sft-20260318-v3` | `review` | 434 | 183,921 |
+| `robustness-noise-traps-ru-topup-sft-20260320` | `robustness` | 400 | 34,260 |
+| `ts-rust-coding-sft-20260318-v3` | `greenfield` | 388 | 254,951 |
+| `wave-03-growth-sft-20260320` | `wave_backfill` | 230 | 218,922 |
+| `docs-topup-ru-sft-20260320` | `obsidian_docs` | 100 | 9,079 |
+| `long-context-memory-topup-sft-20260321` | `long_context` | 24 | 350,954 |
 
 ## Representative Training Examples
 
@@ -235,9 +240,8 @@ The first patch hit the wrong seam. The new signal points back to `app/config.py
 
 ## Training Recipe
 
-The public run in this repo used:
+The public run used:
 
-- hardware: **Kaggle dual T4**
 - distributed setup: **`torchrun` DDP**
 - training framework: **Unsloth + TRL**
 - base model loading: **4-bit**
@@ -260,11 +264,6 @@ The public run in this repo used:
 - public run total steps: **2256**
 - logging / eval / save cadence: **50 / 125 / 250**
 
-Best public checkpoint recorded in `trainer_state.json`:
-
-- best checkpoint: `checkpoint-2250`
-- best metric: `0.18876151740550995`
-
 ## Prompt Style
 
 This adapter was trained on a simple, explicit prompt layout:
@@ -320,7 +319,7 @@ print(tokenizer.decode(out[0], skip_special_tokens=True))
 
 Graphite 1.0 4B is intended for:
 
-- local and server-side coding assistants
+- coding assistants
 - repo triage and patch-planning copilots
 - Markdown / docs tooling assistants
 - logic and wording critique
@@ -334,7 +333,6 @@ It is especially useful when you want **short, grounded, non-theatrical outputs*
 - It is tuned for **structured technical work**, not general consumer chat.
 - It inherits both strengths and weaknesses from `Qwen/Qwen3.5-4B-Base`.
 - The broad mix is intentionally heavy on repair, tool-use, and reasoning, so purely creative behavior is not a target.
-- This card documents the **first public stream only**. Later `Graphite 1.1` experiments are intentionally excluded.
 
 ## License
 
@@ -349,4 +347,3 @@ Please also review the license and usage terms of the base model:
 - Alibaba Qwen team for the base model
 - Unsloth for the efficient LoRA training stack
 - TRL / Transformers / PEFT / PyTorch maintainers
-- Kaggle dual-T4 environment used for the public training run
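The mix-build steps the card's context lines describe, removing exact duplicate `(user, assistant)` pairs and then taking a deterministic train / validation split, can be sketched as follows. This is an illustrative reconstruction only, not the card's actual mix-build code: the function name `build_split`, the seed, and the validation fraction are all hypothetical.

```python
# Illustrative sketch of the mix-build steps the card describes.
# build_split, seed, and val_fraction are hypothetical, not from the card.
import random

def build_split(examples, seed=42, val_fraction=0.02):
    # 1. Drop exact duplicate (user, assistant) pairs, keeping first occurrence.
    seen, unique = set(), []
    for ex in examples:
        key = (ex["user"], ex["assistant"])
        if key not in seen:
            seen.add(key)
            unique.append(ex)
    # 2. Deterministic shuffle: the same seed reproduces the same split.
    rng = random.Random(seed)
    order = list(range(len(unique)))
    rng.shuffle(order)
    n_val = max(1, int(len(order) * val_fraction))
    val = [unique[i] for i in order[:n_val]]
    train = [unique[i] for i in order[n_val:]]
    return train, val
```

Because the shuffle is driven by a fixed-seed `random.Random` instance rather than the global RNG, repeated runs over the same input produce byte-identical splits, which is what makes a split "deterministic" in the sense the card uses.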