---
license: apache-2.0
base_model: Qwen/Qwen3.5-4B-Base
library_name: peft
pipeline_tag: text-generation
model_name: Graphite 1.0 4B
language:
- en
- ru
tags:
- qwen
- qwen3.5
- peft
- lora
- unsloth
- trl
- sft
- code
- reasoning
- bilingual
- obsidian
- graphite
---

# Graphite 1.0 4B

`Graphite 1.0 4B` is the first public LoRA adapter from the Graphite / Obsidian-Critic training stream. It is built on top of [`Qwen/Qwen3.5-4B-Base`](https://huggingface.co/Qwen/Qwen3.5-4B-Base) and tuned for strict, grounded, low-noise responses across:

- repo repair and debugging
- agent tool-use formatting
- technical writing and Markdown workflows
- code review and integration tasks
- logic and factual precision
- bilingual Russian / English instruction following

## What This Repository Contains

This repo contains a **LoRA adapter**, not merged base weights.

- Base model: `Qwen/Qwen3.5-4B-Base`
- Adapter type: `LoRA`
- Rank: `r=16`
- Alpha: `16`
- Dropout: `0.0`
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
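
For reference, the settings above map onto a PEFT `LoraConfig` roughly like this. This is a sketch for re-creating a comparable adapter, not the exact object used in training; `bias="none"` is an assumption (it is the PEFT default):

```python
from peft import LoraConfig

# Sketch of a LoraConfig matching the adapter settings listed above.
# bias="none" is assumed, not confirmed by this repo's config listing.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",
    task_type="CAUSAL_LM",
)
```

The authoritative values live in `adapter_config.json` in this repo.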

Files of interest:

- `adapter_model.safetensors`: LoRA weights
- `adapter_config.json`: PEFT adapter config
- `tokenizer.json`, `tokenizer_config.json`, `chat_template.jinja`: tokenizer assets
- `run_summary.json`: public training run summary
- `length_stats.json`: length filtering summary
- `masking_sanity.json`: formatting sanity check

## Training Lineage

This adapter corresponds to the **first public Graphite 1.0 4B full fine-tune stream**.

- dataset family: **`obsidian-critic-broad-mix-20260321`**
- training stack: **Unsloth + TRL + torchrun DDP**
- base model: **`Qwen/Qwen3.5-4B-Base`**

Notebook lineage used for this stream:

- `obsidian_critic_qwen35_t4x2_unsloth_kaggle.ipynb`: smoke-test notebook for the broad mix
- `obsidian_critic_qwen35_t4x2_unsloth_kaggle_full.ipynb`: full fine-tune lineage used to produce the public LoRA run

## Dataset Provenance

The training data for this first public stream comes from the mixed dataset:

- dataset name: `obsidian-critic-broad-mix-20260321`
- examples in mixed dataset: `37,008`
- approximate token volume: `6,885,960`
- exact duplicate `(user, assistant)` pairs removed during mix build: `3,469`
- normalized near-duplicates removed from wave backfill rows: `201`
- dataset SHA-256: `5ba1924b46d08a8ab8ad7ed5e1f74b13cc3e847b3a04b714934953975fd9300a`
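
If you rebuild or redistribute the mix, the checksum above can be verified with a small streaming helper like this. This is a generic sketch; `mix.jsonl` is a placeholder filename, not a file shipped in this repo:

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 in 1 MiB chunks so large dataset files never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Usage against a local copy of the mix (hypothetical path):
# assert sha256_of_file("mix.jsonl") == "5ba1924b46d08a8ab8ad7ed5e1f74b13cc3e847b3a04b714934953975fd9300a"
```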

The public training run then created a deterministic train / validation split and applied sequence-length filtering:

- train rows before filter: `36,638`
- validation rows before filter: `370`
- train rows after filter: `36,081`
- validation rows after filter: `363`
- rows removed by length filtering: `564`
- minimum kept sequence length: `48` tokens
- maximum kept sequence length: `2048` tokens
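
The filter itself is easy to reproduce: tokenize each rendered example and keep only rows whose token count falls in the closed range `[48, 2048]`. A minimal sketch, with a hypothetical `token_lengths` list standing in for real tokenizer output:

```python
MIN_LEN, MAX_LEN = 48, 2048  # bounds reported in length_stats.json, both inclusive

def length_filter(token_lengths: list[int]) -> list[int]:
    """Return the indices of rows whose token count lies within [MIN_LEN, MAX_LEN]."""
    return [i for i, n in enumerate(token_lengths) if MIN_LEN <= n <= MAX_LEN]

# Hypothetical rows of 12, 48, 900, 2048, and 4000 tokens:
kept = length_filter([12, 48, 900, 2048, 4000])
print(kept)  # -> [1, 2, 3]
```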

### Mix Roles

| Role | Examples | Approx. tokens |
| --- | ---: | ---: |
| `repair` | 5,353 | 983,890 |
| `tool_use` | 4,682 | 455,600 |
| `core_real` | 4,200 | 1,043,187 |
| `robustness` | 3,600 | 397,399 |
| `agent_core` | 3,200 | 645,426 |
| `logic` | 3,031 | 297,812 |
| `factual` | 2,960 | 142,787 |
| `obsidian_docs` | 2,740 | 490,850 |
| `reasoning` | 2,200 | 655,624 |
| `greenfield` | 1,488 | 563,331 |
| `integration` | 1,473 | 241,331 |
| `review` | 1,327 | 343,007 |
| `regularizer` | 500 | 55,840 |
| `wave_backfill` | 230 | 218,922 |
| `long_context` | 24 | 350,954 |
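
As a quick consistency check, the role-level example counts in the table sum back to the 37,008 examples reported for the mixed dataset:

```python
# Example counts per role, copied from the Mix Roles table above.
role_examples = {
    "repair": 5353, "tool_use": 4682, "core_real": 4200, "robustness": 3600,
    "agent_core": 3200, "logic": 3031, "factual": 2960, "obsidian_docs": 2740,
    "reasoning": 2200, "greenfield": 1488, "integration": 1473, "review": 1327,
    "regularizer": 500, "wave_backfill": 230, "long_context": 24,
}

total = sum(role_examples.values())
print(total)  # -> 37008, matching the mixed-dataset size above
```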

### Source Dataset Table

| Dataset | Role | Examples | Approx. tokens |
| --- | --- | ---: | ---: |
| `real-world-grounded-topup-sft-20260320` | `core_real` | 3,000 | 804,280 |
| `robustness-noise-traps-sft-20260320` | `robustness` | 3,200 | 363,139 |
| `factual-erudition-sft-20260319` | `factual` | 2,960 | 142,787 |
| `agent-gap-fixes-sft-20260320` | `agent_core` | 2,500 | 555,785 |
| `code-fix-critical-topup-sft-20260321` | `repair` | 2,500 | 440,355 |
| `code-agent-tooluse-sft-20260319` | `tool_use` | 2,400 | 208,335 |
| `docs-engineering-review-topup-sft-20260320` | `obsidian_docs` | 1,600 | 206,794 |
| `format-tool-discipline-sft-20260319` | `tool_use` | 1,582 | 179,009 |
| `multi-step-debug-sft-20260319` | `reasoning` | 1,200 | 345,084 |
| `real-world-seed-expansion-sft-20260321` | `core_real` | 1,200 | 238,907 |
| `runtime-debug-grounded-sft-20260319` | `repair` | 1,193 | 218,845 |
| `logic-core-sft-20260319` | `logic` | 1,131 | 139,188 |
| `code-architecture-sft-20260319` | `greenfield` | 1,100 | 308,380 |
| `tdd-test-first-sft-20260319` | `reasoning` | 1,000 | 310,540 |
| `logic-sanity-sft-20260319` | `logic` | 996 | 69,260 |
| `code-repair-patch-sft-20260319` | `repair` | 955 | 145,544 |
| `logic-precision-ru-sft-20260319` | `logic` | 904 | 89,364 |
| `security-repair-review-sft-20260319` | `review` | 893 | 159,086 |
| `db-and-migrations-sft-20260319` | `integration` | 867 | 119,740 |
| `multi-file-repo-repair-sft-20260319` | `repair` | 705 | 179,146 |
| `agent-gap-fixes-ru-topup-sft-20260320` | `agent_core` | 700 | 89,641 |
| `code-agent-tooluse-ru-topup-sft-20260320` | `tool_use` | 700 | 68,256 |
| `backend-frontend-ops-sft-20260319` | `integration` | 606 | 121,591 |
| `docs-topup-sft-20260320` | `obsidian_docs` | 600 | 125,125 |
| `anti-overthinking-pack-sft-20260321` | `regularizer` | 500 | 55,840 |
| `docs-markdown-sft-20260318-v3` | `obsidian_docs` | 440 | 149,852 |
| `ts-rust-code-review-sft-20260318-v3` | `review` | 434 | 183,921 |
| `robustness-noise-traps-ru-topup-sft-20260320` | `robustness` | 400 | 34,260 |
| `ts-rust-coding-sft-20260318-v3` | `greenfield` | 388 | 254,951 |
| `wave-03-growth-sft-20260320` | `wave_backfill` | 230 | 218,922 |
| `docs-topup-ru-sft-20260320` | `obsidian_docs` | 100 | 9,079 |
| `long-context-memory-topup-sft-20260321` | `long_context` | 24 | 350,954 |

## Representative Training Examples

These are short excerpts from the actual broad-mix and component datasets used in the first public stream.

### Broad mix: grounded repo repair

**System**
```text
Ground the answer in one active boundary, not in general advice.
```

**User**
```text
First fix failed. Repo: packages/notes-index
Patch 1 removed one fallback in `app/config.py`.
New evidence: `docker compose: service worker exited with code 1`.
Need the second minimal fix.
```

**Assistant**
```text
The first patch hit the wrong seam. The new signal points back to `app/config.py`.
```

### Broad mix: Russian ambiguity tightening

**System**
```text
Пиши как строгий русскоязычный логический редактор: коротко, ясно, без филологии ради филологии.
```

**User**
```text
Оцени формулировку из инструкции:
`Разрешён доступ аналитики и владельцы без резервной копией.`
```

**Assistant**
```text
Да. Неясно, относится `без резервной копией` к обеим группам или только ко второй.
Строже: `И аналитики, и владельцы получают доступ без резервной копией.`
```

### Tool-use source: strict JSON action selection

**System**
```text
Ты code-workspace агент. Возвращай только валидный JSON.
```

**User**
```text
Исправь этот сломанный tool call и верни только валидный JSON.
{"tool":"search_codebase","args":{"query":"normalize_note_path","path":"src","limit":"4"}}
```

**Assistant**
```json
{"tool":"search_codebase","arguments":{"query":"normalize_note_path","path":"src"}}
```
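
The discipline this example targets can be checked mechanically. Below is a minimal validator sketch: the top-level keys `tool` and `arguments` follow the corrected call above, while the allowed-argument set for `search_codebase` is an illustrative assumption, not a published schema:

```python
import json

ALLOWED_ARGS = {"query", "path"}  # illustrative schema for `search_codebase`

def is_valid_call(raw: str) -> bool:
    """Accept only JSON objects shaped {"tool": str, "arguments": {...}} with known argument keys."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(call, dict) or set(call) != {"tool", "arguments"}:
        return False
    args = call["arguments"]
    return isinstance(call["tool"], str) and isinstance(args, dict) and set(args) <= ALLOWED_ARGS

broken = '{"tool":"search_codebase","args":{"query":"normalize_note_path","path":"src","limit":"4"}}'
fixed = '{"tool":"search_codebase","arguments":{"query":"normalize_note_path","path":"src"}}'
print(is_valid_call(broken), is_valid_call(fixed))  # -> False True
```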

### Docs source: structured README generation

**System**
```text
Ты technical writer для engineering workflows. Только полезная структура.
```

**User**
```text
Напиши полноценный README.md для `md-audit`.
Контекст: scan Markdown repositories for broken links, empty headings, duplicate slugs, and metadata regressions.
```

### Logic source: contradiction checking

**System**
```text
Будь аккуратен с условиями и кванторами. Предпочитай точность эффектности.
```

**User**
```text
Проверь набор на непротиворечивость.
1. `Борис` гость -> не может публиковать.
2. `Борис` гость.
3. `Борис` может публиковать.
```

**Assistant**
```text
Нет. Противоречие между выводом из 1+2 и пунктом 3.
```

## Training Recipe

The public run used:

- distributed setup: **`torchrun` DDP**
- training framework: **Unsloth + TRL**
- base model loading: **4-bit**
- LoRA rank: **16**
- LoRA alpha: **16**
- LoRA dropout: **0.0**
- max sequence length: **2048**
- per-device train batch size: **1**
- gradient accumulation steps: **8**
- effective global batch size: **16** examples / optimization step
- epochs: **1**
- optimizer: **`adamw_8bit`**
- scheduler: **cosine**
- learning rate: **1e-4**
- warmup steps: **5**
- gradient checkpointing: **enabled**
- FP16: **forced**
- packing: **disabled**
- completion-only loss: **disabled**
- public run total steps: **2256**
- logging / eval / save cadence: **50 / 125 / 250** steps
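
Two of the numbers above follow from the others. With a per-device batch of 1, gradient accumulation of 8, and two GPUs under DDP (an assumption based on the `t4x2` notebook names), the effective global batch is 16, and one epoch over the 36,081 filtered train rows reproduces the reported step count:

```python
import math

per_device_batch = 1
grad_accum = 8
num_gpus = 2            # assumed: torchrun DDP on a T4 x2 node, per the notebook names
train_rows = 36_081     # train rows after length filtering

global_batch = per_device_batch * grad_accum * num_gpus
total_steps = math.ceil(train_rows / global_batch)
print(global_batch, total_steps)  # -> 16 2256
```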

## Prompt Style

This adapter was trained on a simple, explicit prompt layout:

```text
System:
<system prompt>

User:
<user prompt>

Assistant:
```
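
A small helper keeps this layout exact, including the trailing newline after `Assistant:`. The function name is ours, not part of the repo:

```python
def build_prompt(system: str, user: str) -> str:
    """Render the System/User/Assistant layout the adapter was trained on."""
    return f"System:\n{system}\n\nUser:\n{user}\n\nAssistant:\n"

prompt = build_prompt("Return the smallest useful answer.", "Summarize the failing test.")
print(prompt.endswith("Assistant:\n"))  # -> True
```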

For best results, keep prompts concise, grounded, and task-shaped. The adapter responds best to:

- repo repair tasks with concrete evidence
- exact wording / logic cleanup tasks
- tool-call selection with explicit schemas
- technical writing with clear requested sections
- review / integration prompts that specify files, symptoms, and expected outcomes

## Quick Start

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen3.5-4B-Base"
adapter_id = "Starred09/obsidian-critic-qwen35-4b-base-lora"

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

system = "Return the smallest useful answer. Do not invent missing evidence."
user = "Repo: apps/desktop-shell. Build fails with ENOENT on dist/server.js. Point to the first file to inspect."
prompt = f"System:\n{system}\n\nUser:\n{user}\n\nAssistant:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=160)

print(tokenizer.decode(out[0], skip_special_tokens=True))
```

## Intended Use

Graphite 1.0 4B is intended for:

- coding assistants
- repo triage and patch-planning copilots
- Markdown / docs tooling assistants
- logic and wording critique
- bilingual technical task routing

It is especially useful when you want **short, grounded, non-theatrical outputs** instead of generic assistant prose.

## Limitations

- This is an **adapter**, not a standalone merged model.
- It is tuned for **structured technical work**, not general consumer chat.
- It inherits both strengths and weaknesses from `Qwen/Qwen3.5-4B-Base`.
- The broad mix is intentionally heavy on repair, tool-use, and reasoning, so purely creative behavior is not a target.

## License

This repository is released under the **Apache License 2.0**. See [`LICENSE`](./LICENSE).

Please also review the license and usage terms of the base model:

- [`Qwen/Qwen3.5-4B-Base`](https://huggingface.co/Qwen/Qwen3.5-4B-Base)

## Acknowledgements

- Alibaba Qwen team for the base model
- Unsloth for the efficient LoRA training stack
- TRL / Transformers / PEFT / PyTorch maintainers