---
license: apache-2.0
base_model: Qwen/Qwen3.5-4B-Base
library_name: peft
pipeline_tag: text-generation
model_name: Graphite 1.0 4B
language:
- en
- ru
tags:
- qwen
- qwen3.5
- peft
- lora
- unsloth
- trl
- sft
- code
- reasoning
- bilingual
- obsidian
- graphite
---

# Graphite 1.0 4B

`Graphite 1.0 4B` is the first public LoRA adapter from the Graphite / Obsidian-Critic training stream. It is built on top of [`Qwen/Qwen3.5-4B-Base`](https://huggingface.co/Qwen/Qwen3.5-4B-Base) and tuned for strict, grounded, low-noise responses across:

- repo repair and debugging
- agent tool-use formatting
- technical writing and Markdown workflows
- code review and integration tasks
- logic and factual precision
- bilingual Russian / English instruction following

## What This Repository Contains

This repo contains a **LoRA adapter**, not merged base weights.

- Base model: `Qwen/Qwen3.5-4B-Base`
- Adapter type: `LoRA`
- Rank: `r=16`
- Alpha: `16`
- Dropout: `0.0`
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`

Files of interest:

- `adapter_model.safetensors`: LoRA weights
- `adapter_config.json`: PEFT adapter config
- `tokenizer.json`, `tokenizer_config.json`, `chat_template.jinja`: tokenizer assets
- `run_summary.json`: public training run summary
- `length_stats.json`: length filtering summary
- `masking_sanity.json`: formatting sanity check

## Training Lineage

This adapter corresponds to the **first public Graphite 1.0 4B full fine-tune stream**.
- dataset family: **`obsidian-critic-broad-mix-20260321`**
- training stack: **Unsloth + TRL + torchrun DDP**
- base model: **`Qwen/Qwen3.5-4B-Base`**

Notebook lineage used for this stream:

- `obsidian_critic_qwen35_t4x2_unsloth_kaggle.ipynb`: smoke-test notebook for the broad mix
- `obsidian_critic_qwen35_t4x2_unsloth_kaggle_full.ipynb`: full fine-tune lineage used to produce the public LoRA run

## Dataset Provenance

The training data for this first public stream comes from the mixed dataset:

- dataset name: `obsidian-critic-broad-mix-20260321`
- examples in mixed dataset: `37,008`
- approximate token volume: `6,885,960`
- exact duplicate `(user, assistant)` pairs removed during mix build: `3,469`
- normalized near-duplicates removed from wave backfill rows: `201`
- dataset SHA-256: `5ba1924b46d08a8ab8ad7ed5e1f74b13cc3e847b3a04b714934953975fd9300a`

The public training run then created a deterministic train / validation split and applied sequence-length filtering:

- train rows before filter: `36,638`
- validation rows before filter: `370`
- train rows after filter: `36,081`
- validation rows after filter: `363`
- rows removed by length filtering: `564`
- minimum kept sequence length: `48`
- maximum kept sequence length: `2048`

### Mix Roles

| Role | Examples | Approx. tokens |
| --- | ---: | ---: |
| `repair` | 5,353 | 983,890 |
| `tool_use` | 4,682 | 455,600 |
| `core_real` | 4,200 | 1,043,187 |
| `robustness` | 3,600 | 397,399 |
| `agent_core` | 3,200 | 645,426 |
| `logic` | 3,031 | 297,812 |
| `factual` | 2,960 | 142,787 |
| `obsidian_docs` | 2,740 | 490,850 |
| `reasoning` | 2,200 | 655,624 |
| `greenfield` | 1,488 | 563,331 |
| `integration` | 1,473 | 241,331 |
| `review` | 1,327 | 343,007 |
| `regularizer` | 500 | 55,840 |
| `wave_backfill` | 230 | 218,922 |
| `long_context` | 24 | 350,954 |

### Source Dataset Table

| Dataset | Role | Examples | Approx. tokens |
| --- | --- | ---: | ---: |
| `real-world-grounded-topup-sft-20260320` | `core_real` | 3,000 | 804,280 |
| `robustness-noise-traps-sft-20260320` | `robustness` | 3,200 | 363,139 |
| `factual-erudition-sft-20260319` | `factual` | 2,960 | 142,787 |
| `agent-gap-fixes-sft-20260320` | `agent_core` | 2,500 | 555,785 |
| `code-fix-critical-topup-sft-20260321` | `repair` | 2,500 | 440,355 |
| `code-agent-tooluse-sft-20260319` | `tool_use` | 2,400 | 208,335 |
| `docs-engineering-review-topup-sft-20260320` | `obsidian_docs` | 1,600 | 206,794 |
| `format-tool-discipline-sft-20260319` | `tool_use` | 1,582 | 179,009 |
| `multi-step-debug-sft-20260319` | `reasoning` | 1,200 | 345,084 |
| `real-world-seed-expansion-sft-20260321` | `core_real` | 1,200 | 238,907 |
| `runtime-debug-grounded-sft-20260319` | `repair` | 1,193 | 218,845 |
| `logic-core-sft-20260319` | `logic` | 1,131 | 139,188 |
| `code-architecture-sft-20260319` | `greenfield` | 1,100 | 308,380 |
| `tdd-test-first-sft-20260319` | `reasoning` | 1,000 | 310,540 |
| `logic-sanity-sft-20260319` | `logic` | 996 | 69,260 |
| `code-repair-patch-sft-20260319` | `repair` | 955 | 145,544 |
| `logic-precision-ru-sft-20260319` | `logic` | 904 | 89,364 |
| `security-repair-review-sft-20260319` | `review` | 893 | 159,086 |
| `db-and-migrations-sft-20260319` | `integration` | 867 | 119,740 |
| `multi-file-repo-repair-sft-20260319` | `repair` | 705 | 179,146 |
| `agent-gap-fixes-ru-topup-sft-20260320` | `agent_core` | 700 | 89,641 |
| `code-agent-tooluse-ru-topup-sft-20260320` | `tool_use` | 700 | 68,256 |
| `backend-frontend-ops-sft-20260319` | `integration` | 606 | 121,591 |
| `docs-topup-sft-20260320` | `obsidian_docs` | 600 | 125,125 |
| `anti-overthinking-pack-sft-20260321` | `regularizer` | 500 | 55,840 |
| `docs-markdown-sft-20260318-v3` | `obsidian_docs` | 440 | 149,852 |
| `ts-rust-code-review-sft-20260318-v3` | `review` | 434 | 183,921 |
| `robustness-noise-traps-ru-topup-sft-20260320` | `robustness` | 400 | 34,260 |
| `ts-rust-coding-sft-20260318-v3` | `greenfield` | 388 | 254,951 |
| `wave-03-growth-sft-20260320` | `wave_backfill` | 230 | 218,922 |
| `docs-topup-ru-sft-20260320` | `obsidian_docs` | 100 | 9,079 |
| `long-context-memory-topup-sft-20260321` | `long_context` | 24 | 350,954 |

## Representative Training Examples

These are short excerpts from the actual broad-mix and component datasets used in the first public stream. Russian excerpts are reproduced verbatim, since bilingual instruction following is part of the training target.

### Broad mix: grounded repo repair

**System**

```text
Ground the answer in one active boundary, not in general advice.
```

**User**

```text
First fix failed. Repo: packages/notes-index
Patch 1 removed one fallback in `app/config.py`.
New evidence: `docker compose: service worker exited with code 1`.
Need the second minimal fix.
```

**Assistant**

```text
The first patch hit the wrong seam. The new signal points back to `app/config.py`.
```

### Broad mix: Russian ambiguity tightening

**System**

```text
Пиши как строгий русскоязычный логический редактор: коротко, ясно, без филологии ради филологии.
```

**User**

```text
Оцени формулировку из инструкции: `Разрешён доступ аналитики и владельцы без резервной копией.`
```

**Assistant**

```text
Да. Неясно, относится `без резервной копией` к обеим группам или только ко второй.
Строже: `И аналитики, и владельцы получают доступ без резервной копией.`
```

### Tool-use source: strict JSON action selection

**System**

```text
Ты code-workspace агент. Возвращай только валидный JSON.
```

**User**

```text
Исправь этот сломанный tool call и верни только валидный JSON.
{"tool":"search_codebase","args":{"query":"normalize_note_path","path":"src","limit":"4"}}
```

**Assistant**

```json
{"tool":"search_codebase","arguments":{"query":"normalize_note_path","path":"src"}}
```

### Docs source: structured README generation

**System**

```text
Ты technical writer для engineering workflows. Только полезная структура.
```

**User**

```text
Напиши полноценный README.md для `md-audit`.
Контекст: scan Markdown repositories for broken links, empty headings, duplicate slugs, and metadata regressions.
```

### Logic source: contradiction checking

**System**

```text
Будь аккуратен с условиями и кванторами. Предпочитай точность эффектности.
```

**User**

```text
Проверь набор на непротиворечивость.
1. `Борис` гость -> не может публиковать.
2. `Борис` гость.
3. `Борис` может публиковать.
```

**Assistant**

```text
Нет. Противоречие между выводом из 1+2 и пунктом 3.
```

## Training Recipe

The public run used:

- distributed setup: **`torchrun` DDP**
- training framework: **Unsloth + TRL**
- base model loading: **4-bit**
- LoRA rank: **16**
- LoRA alpha: **16**
- LoRA dropout: **0.0**
- max sequence length: **2048**
- per-device train batch size: **1**
- gradient accumulation steps: **8**
- effective global batch size: **16** examples / optimization step
- epochs: **1**
- optimizer: **`adamw_8bit`**
- scheduler: **cosine**
- learning rate: **1e-4**
- warmup steps: **5**
- gradient checkpointing: **enabled**
- FP16: **forced**
- packing: **disabled**
- completion-only loss: **disabled**
- public run total steps: **2256**
- logging / eval / save cadence: **50 / 125 / 250**

## Prompt Style

This adapter was trained on a simple, explicit prompt layout:

```text
System:
<system instruction>

User:
<user message>

Assistant:
```

For best results, keep prompts concise, grounded, and task-shaped.
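For reference, the Training Recipe hyperparameters above map onto TRL's `SFTConfig` roughly as follows. This is a hedged sketch, not the actual training script: the output directory name is ours, the exact trainer wiring (Unsloth patching, DDP launch) is omitted, and argument names follow recent TRL releases, so check them against your installed version.

```python
from trl import SFTConfig

# Sketch of the public run's hyperparameters as a TRL SFTConfig.
# Values come from the Training Recipe list; everything else is an assumption.
config = SFTConfig(
    output_dir="graphite-1.0-4b-lora",  # illustrative name, not part of the repo
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,      # 1 per device x 8 accum x 2 GPUs = 16 effective
    num_train_epochs=1,
    optim="adamw_8bit",
    lr_scheduler_type="cosine",
    learning_rate=1e-4,
    warmup_steps=5,
    gradient_checkpointing=True,
    fp16=True,
    max_seq_length=2048,
    packing=False,
    logging_steps=50,
    eval_strategy="steps",
    eval_steps=125,
    save_steps=250,
)
```

With 36,081 filtered train rows and an effective batch of 16, one epoch works out to the reported 2,256 optimization steps.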
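The layout above is plain string concatenation, so it can be wrapped in a small helper. This is an illustrative sketch, assuming the whitespace shown in this card's Quick Start example; the helper name is ours, not part of the repo.

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble the System / User / Assistant layout the adapter was trained on.

    Name and signature are illustrative, not part of this repository.
    """
    return f"System:\n{system}\n\nUser:\n{user}\n\nAssistant:\n"


# Example: a task-shaped repair prompt.
prompt = build_prompt(
    "Return the smallest useful answer. Do not invent missing evidence.",
    "Repo: packages/notes-index. Tests fail after the last patch.",
)
```

The trailing `Assistant:` line leaves the completion point open, so generation starts exactly where the training targets started.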
The adapter responds best to:

- repo repair tasks with concrete evidence
- exact wording / logic cleanup tasks
- tool-call selection with explicit schemas
- technical writing with clear requested sections
- review / integration prompts that specify files, symptoms, and expected outcomes

## Quick Start

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen3.5-4B-Base"
adapter_id = "Starred09/obsidian-critic-qwen35-4b-base-lora"

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)

system = "Return the smallest useful answer. Do not invent missing evidence."
user = "Repo: apps/desktop-shell. Build fails with ENOENT on dist/server.js. Point to the first file to inspect."

prompt = f"System:\n{system}\n\nUser:\n{user}\n\nAssistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=160)

print(tokenizer.decode(out[0], skip_special_tokens=True))
```

## Intended Use

Graphite 1.0 4B is intended for:

- coding assistants
- repo triage and patch-planning copilots
- Markdown / docs tooling assistants
- logic and wording critique
- bilingual technical task routing

It is especially useful when you want **short, grounded, non-theatrical outputs** instead of generic assistant prose.

## Limitations

- This is an **adapter**, not a standalone merged model.
- It is tuned for **structured technical work**, not general consumer chat.
- It inherits both strengths and weaknesses from `Qwen/Qwen3.5-4B-Base`.
- The broad mix is intentionally heavy on repair, tool-use, and reasoning, so purely creative behavior is not a target.

## License

This repository is released under the **Apache License 2.0**. See [`LICENSE`](./LICENSE).
Please also review the license and usage terms of the base model:

- [`Qwen/Qwen3.5-4B-Base`](https://huggingface.co/Qwen/Qwen3.5-4B-Base)

## Acknowledgements

- Alibaba Qwen team for the base model
- Unsloth for the efficient LoRA training stack
- TRL / Transformers / PEFT / PyTorch maintainers