Starred09 committed
Commit 68ada46 · 1 Parent(s): 05e5a00

Clean model card for public Graphite 1.0 4B release

Files changed (1): README.md (+42 -45)
README.md CHANGED
@@ -33,8 +33,6 @@ tags:
 - logic and factual precision
 - bilingual Russian / English instruction following
 
-This repository keeps the legacy slug `obsidian-critic-qwen35-4b-base-lora` because that was the original public upload target, but the public model name for documentation and grant material is **Graphite 1.0 4B**.
-
 ## What This Repository Contains
 
 This repo contains a **LoRA adapter**, not merged base weights.
@@ -57,12 +55,11 @@ Files of interest:
 
 ## Training Lineage
 
-This adapter corresponds to the **first public Kaggle 2xT4 full fine-tune stream** before the later `Graphite 1.1` reweight experiment. For provenance purposes:
+This adapter corresponds to the **first public Graphite 1.0 4B full fine-tune stream**.
 
-- the original dataset family is **`obsidian-critic-broad-mix-20260321`**
-- the training stack is **Unsloth + TRL + torchrun DDP on dual T4**
-- the public upload target for this run was this repo
-- later `Graphite 1.1` experiments are intentionally excluded from this card
+- dataset family: **`obsidian-critic-broad-mix-20260321`**
+- training stack: **Unsloth + TRL + torchrun DDP**
+- base model: **`Qwen/Qwen3.5-4B-Base`**
 
 Notebook lineage used for this stream:
 
@@ -74,7 +71,6 @@ Notebook lineage used for this stream:
 The training data for this first public stream comes from the mixed dataset:
 
 - dataset name: `obsidian-critic-broad-mix-20260321`
-- local source dir: `/home/starred/datasets/obsidian-critic-broad-mix-20260321`
 - examples in mixed dataset: `37,008`
 - approximate token volume: `6,885,960`
 - exact duplicate `(user, assistant)` pairs removed during mix build: `3,469`
@@ -111,33 +107,42 @@ The public training run then created a deterministic train / validation split an
 | `wave_backfill` | 230 | 218,922 |
 | `long_context` | 24 | 350,954 |
 
-### Source Datasets Included In The Broad Mix
-
-- **`obsidian_docs`**: `docs-markdown-sft-20260318-v3`, `docs-engineering-review-topup-sft-20260320`, `docs-topup-sft-20260320`, `docs-topup-ru-sft-20260320`
-- **`tool_use`**: `format-tool-discipline-sft-20260319`, `code-agent-tooluse-sft-20260319`, `code-agent-tooluse-ru-topup-sft-20260320`
-- **`greenfield`**: `code-architecture-sft-20260319`, `ts-rust-coding-sft-20260318-v3`
-- **`repair`**: `runtime-debug-grounded-sft-20260319`, `multi-file-repo-repair-sft-20260319`, `code-repair-patch-sft-20260319`, `code-fix-critical-topup-sft-20260321`
-- **`review`**: `security-repair-review-sft-20260319`, `ts-rust-code-review-sft-20260318-v3`
-- **`integration`**: `db-and-migrations-sft-20260319`, `backend-frontend-ops-sft-20260319`
-- **`reasoning`**: `tdd-test-first-sft-20260319`, `multi-step-debug-sft-20260319`
-- **`agent_core`**: `agent-gap-fixes-sft-20260320`, `agent-gap-fixes-ru-topup-sft-20260320`
-- **`robustness`**: `robustness-noise-traps-sft-20260320`, `robustness-noise-traps-ru-topup-sft-20260320`
-- **`core_real`**: `real-world-grounded-topup-sft-20260320`, `real-world-seed-expansion-sft-20260321`
-- **`long_context`**: `long-context-memory-topup-sft-20260321`
-- **`regularizer`**: `anti-overthinking-pack-sft-20260321`
-- **`logic`**: `logic-core-sft-20260319`, `logic-sanity-sft-20260319`, `logic-precision-ru-sft-20260319`
-- **`factual`**: `factual-erudition-sft-20260319`
-- **`wave_backfill`**: `wave-01-growth-sft-20260319`, `wave-02-growth-sft-20260319`, `wave-03-growth-sft-20260320`
-
-Explicitly excluded from the mix build:
-
-- `anti-regression-eval-20260319`
-- `curated-code-train-mix-20260320`
-- `graphite-1.0-code-train-mix-20260321`
-- `css-ui-premium-sft-20260319`
-- `css-style-premium-sft-20260320`
-- `css-style-premium-ru-topup-sft-20260320`
-- `css-style-sft-20260318`
+### Source Dataset Table
+
+| Dataset | Role | Examples | Approx. tokens |
+| --- | --- | ---: | ---: |
+| `real-world-grounded-topup-sft-20260320` | `core_real` | 3,000 | 804,280 |
+| `robustness-noise-traps-sft-20260320` | `robustness` | 3,200 | 363,139 |
+| `factual-erudition-sft-20260319` | `factual` | 2,960 | 142,787 |
+| `agent-gap-fixes-sft-20260320` | `agent_core` | 2,500 | 555,785 |
+| `code-fix-critical-topup-sft-20260321` | `repair` | 2,500 | 440,355 |
+| `code-agent-tooluse-sft-20260319` | `tool_use` | 2,400 | 208,335 |
+| `docs-engineering-review-topup-sft-20260320` | `obsidian_docs` | 1,600 | 206,794 |
+| `format-tool-discipline-sft-20260319` | `tool_use` | 1,582 | 179,009 |
+| `multi-step-debug-sft-20260319` | `reasoning` | 1,200 | 345,084 |
+| `real-world-seed-expansion-sft-20260321` | `core_real` | 1,200 | 238,907 |
+| `runtime-debug-grounded-sft-20260319` | `repair` | 1,193 | 218,845 |
+| `logic-core-sft-20260319` | `logic` | 1,131 | 139,188 |
+| `code-architecture-sft-20260319` | `greenfield` | 1,100 | 308,380 |
+| `tdd-test-first-sft-20260319` | `reasoning` | 1,000 | 310,540 |
+| `logic-sanity-sft-20260319` | `logic` | 996 | 69,260 |
+| `code-repair-patch-sft-20260319` | `repair` | 955 | 145,544 |
+| `logic-precision-ru-sft-20260319` | `logic` | 904 | 89,364 |
+| `security-repair-review-sft-20260319` | `review` | 893 | 159,086 |
+| `db-and-migrations-sft-20260319` | `integration` | 867 | 119,740 |
+| `agent-gap-fixes-ru-topup-sft-20260320` | `agent_core` | 700 | 89,641 |
+| `code-agent-tooluse-ru-topup-sft-20260320` | `tool_use` | 700 | 68,256 |
+| `multi-file-repo-repair-sft-20260319` | `repair` | 705 | 179,146 |
+| `backend-frontend-ops-sft-20260319` | `integration` | 606 | 121,591 |
+| `docs-topup-sft-20260320` | `obsidian_docs` | 600 | 125,125 |
+| `anti-overthinking-pack-sft-20260321` | `regularizer` | 500 | 55,840 |
+| `docs-markdown-sft-20260318-v3` | `obsidian_docs` | 440 | 149,852 |
+| `ts-rust-code-review-sft-20260318-v3` | `review` | 434 | 183,921 |
+| `robustness-noise-traps-ru-topup-sft-20260320` | `robustness` | 400 | 34,260 |
+| `ts-rust-coding-sft-20260318-v3` | `greenfield` | 388 | 254,951 |
+| `wave-03-growth-sft-20260320` | `wave_backfill` | 230 | 218,922 |
+| `docs-topup-ru-sft-20260320` | `obsidian_docs` | 100 | 9,079 |
+| `long-context-memory-topup-sft-20260321` | `long_context` | 24 | 350,954 |
 
 ## Representative Training Examples
 
@@ -235,9 +240,8 @@ The first patch hit the wrong seam. The new signal points back to `app/config.py
 
 ## Training Recipe
 
-The public run in this repo used:
+The public run used:
 
-- hardware: **Kaggle dual T4**
 - distributed setup: **`torchrun` DDP**
 - training framework: **Unsloth + TRL**
 - base model loading: **4-bit**
@@ -260,11 +264,6 @@ The public run in this repo used:
 - public run total steps: **2256**
 - logging / eval / save cadence: **50 / 125 / 250**
 
-Best public checkpoint recorded in `trainer_state.json`:
-
-- best checkpoint: `checkpoint-2250`
-- best metric: `0.18876151740550995`
-
 ## Prompt Style
 
 This adapter was trained on a simple, explicit prompt layout:
@@ -320,7 +319,7 @@ print(tokenizer.decode(out[0], skip_special_tokens=True))
 
 Graphite 1.0 4B is intended for:
 
-- local and server-side coding assistants
+- coding assistants
 - repo triage and patch-planning copilots
 - Markdown / docs tooling assistants
 - logic and wording critique
@@ -334,7 +333,6 @@ It is especially useful when you want **short, grounded, non-theatrical outputs*
 - It is tuned for **structured technical work**, not general consumer chat.
 - It inherits both strengths and weaknesses from `Qwen/Qwen3.5-4B-Base`.
 - The broad mix is intentionally heavy on repair, tool-use, and reasoning, so purely creative behavior is not a target.
-- This card documents the **first public stream only**. Later `Graphite 1.1` experiments are intentionally excluded.
 
 ## License
 
@@ -349,4 +347,3 @@ Please also review the license and usage terms of the base model:
 - Alibaba Qwen team for the base model
 - Unsloth for the efficient LoRA training stack
 - TRL / Transformers / PEFT / PyTorch maintainers
-- Kaggle dual-T4 environment used for the public training run
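The mix-build steps the card's context lines describe, removing exact duplicate `(user, assistant)` pairs and then taking a deterministic train / validation split, can be sketched as follows. This is an illustrative reconstruction only, not the card's actual mix-build code: the function name `build_split`, the seed, and the validation fraction are all hypothetical.

```python
# Illustrative sketch of the mix-build steps the card describes.
# build_split, seed, and val_fraction are hypothetical, not from the card.
import random

def build_split(examples, seed=42, val_fraction=0.02):
    # 1. Drop exact duplicate (user, assistant) pairs, keeping first occurrence.
    seen, unique = set(), []
    for ex in examples:
        key = (ex["user"], ex["assistant"])
        if key not in seen:
            seen.add(key)
            unique.append(ex)
    # 2. Deterministic shuffle: the same seed reproduces the same split.
    rng = random.Random(seed)
    order = list(range(len(unique)))
    rng.shuffle(order)
    n_val = max(1, int(len(order) * val_fraction))
    val = [unique[i] for i in order[:n_val]]
    train = [unique[i] for i in order[n_val:]]
    return train, val
```

Because the shuffle is driven by a fixed-seed `random.Random` instance rather than the global RNG, repeated runs over the same input produce byte-identical splits, which is what makes a split "deterministic" in the sense the card uses.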