Graphite 1.0 4B

Graphite 1.0 4B is the first public LoRA adapter from the Graphite / Obsidian-Critic training stream. It is built on top of Qwen/Qwen3.5-4B-Base and tuned for strict, grounded, low-noise responses across:

  • repo repair and debugging
  • agent tool-use formatting
  • technical writing and Markdown workflows
  • code review and integration tasks
  • logic and factual precision
  • bilingual Russian / English instruction following

What This Repository Contains

This repo contains a LoRA adapter, not merged base weights; an equivalent PEFT config is sketched after the settings list below.

  • Base model: Qwen/Qwen3.5-4B-Base
  • Adapter type: LoRA
  • Rank: r=16
  • Alpha: 16
  • Dropout: 0.0
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
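
For reference, a minimal sketch of the matching PEFT config (the task_type value is an assumption; everything else mirrors the list above):

from peft import LoraConfig

# Mirrors the adapter settings described above; task_type is assumed.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)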

Files of interest (a fetch sketch follows the list):

  • adapter_model.safetensors: LoRA weights
  • adapter_config.json: PEFT adapter config
  • tokenizer.json, tokenizer_config.json, chat_template.jinja: tokenizer assets
  • run_summary.json: public training run summary
  • length_stats.json: length filtering summary
  • masking_sanity.json: formatting sanity check
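
Any of these files can be pulled directly with huggingface_hub; a minimal sketch, assuming the repo id used in Quick Start below:

import json

from huggingface_hub import hf_hub_download

repo_id = "Starred09/obsidian-critic-qwen35-4b-base-lora"

# Download and inspect the public training run summary.
path = hf_hub_download(repo_id=repo_id, filename="run_summary.json")
with open(path) as f:
    print(json.dumps(json.load(f), indent=2))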

Training Lineage

This adapter corresponds to the first public Graphite 1.0 4B fine-tune stream (the full run rather than the smoke test; see the notebook lineage below).

  • dataset family: obsidian-critic-broad-mix-20260321
  • training stack: Unsloth + TRL + torchrun DDP
  • base model: Qwen/Qwen3.5-4B-Base

Notebook lineage used for this stream:

  • obsidian_critic_qwen35_t4x2_unsloth_kaggle.ipynb: smoke-test notebook for the broad mix
  • obsidian_critic_qwen35_t4x2_unsloth_kaggle_full.ipynb: full fine-tune lineage used to produce the public LoRA run

Dataset Provenance

The training data for this first public stream comes from the mixed dataset below; a checksum-verification sketch follows the list:

  • dataset name: obsidian-critic-broad-mix-20260321
  • examples in mixed dataset: 37,008
  • approximate token volume: 6,885,960
  • exact duplicate (user, assistant) pairs removed during mix build: 3,469
  • normalized near-duplicates removed from wave backfill rows: 201
  • dataset SHA-256: 5ba1924b46d08a8ab8ad7ed5e1f74b13cc3e847b3a04b714934953975fd9300a
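
A minimal sketch for verifying the published checksum (the local file name is hypothetical; the card does not state which exported artifact the hash covers):

import hashlib

EXPECTED = "5ba1924b46d08a8ab8ad7ed5e1f74b13cc3e847b3a04b714934953975fd9300a"

# Stream the file in 1 MiB chunks so large exports do not need to fit in memory.
digest = hashlib.sha256()
with open("obsidian-critic-broad-mix-20260321.jsonl", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)

assert digest.hexdigest() == EXPECTED, "dataset checksum mismatch"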

The public training run then created a deterministic train / validation split and applied sequence-length filtering, as sketched after this list:

  • train rows before filter: 36,638
  • validation rows before filter: 370
  • train rows after filter: 36,081
  • validation rows after filter: 363
  • removed for length filtering: 564
  • minimum kept sequence length: 48
  • maximum kept sequence length: 2048
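
A minimal sketch of that step, assuming each row carries a pre-rendered "text" field; the split seed, field name, and file name are assumptions, while the row counts and the 48-2048 token bounds come from the list above:

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-4B-Base")
ds = load_dataset("json", data_files="obsidian-critic-broad-mix-20260321.jsonl")["train"]

# Deterministic split: 370 validation rows held out with a fixed seed.
split = ds.train_test_split(test_size=370, seed=42)

def keep(row):
    # Keep rows whose tokenized length falls inside the published bounds.
    n_tokens = len(tokenizer(row["text"]).input_ids)
    return 48 <= n_tokens <= 2048

train_rows = split["train"].filter(keep)  # 36,638 -> 36,081 in the public run
val_rows = split["test"].filter(keep)     # 370 -> 363 in the public run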

Mix Roles

| Role          | Examples | Approx. tokens |
| ------------- | -------: | -------------: |
| repair        |    5,353 |        983,890 |
| tool_use      |    4,682 |        455,600 |
| core_real     |    4,200 |      1,043,187 |
| robustness    |    3,600 |        397,399 |
| agent_core    |    3,200 |        645,426 |
| logic         |    3,031 |        297,812 |
| factual       |    2,960 |        142,787 |
| obsidian_docs |    2,740 |        490,850 |
| reasoning     |    2,200 |        655,624 |
| greenfield    |    1,488 |        563,331 |
| integration   |    1,473 |        241,331 |
| review        |    1,327 |        343,007 |
| regularizer   |      500 |         55,840 |
| wave_backfill |      230 |        218,922 |
| long_context  |       24 |        350,954 |

Source Dataset Table

| Dataset | Role | Examples | Approx. tokens |
| --- | --- | ---: | ---: |
| real-world-grounded-topup-sft-20260320 | core_real | 3,000 | 804,280 |
| robustness-noise-traps-sft-20260320 | robustness | 3,200 | 363,139 |
| factual-erudition-sft-20260319 | factual | 2,960 | 142,787 |
| agent-gap-fixes-sft-20260320 | agent_core | 2,500 | 555,785 |
| code-fix-critical-topup-sft-20260321 | repair | 2,500 | 440,355 |
| code-agent-tooluse-sft-20260319 | tool_use | 2,400 | 208,335 |
| docs-engineering-review-topup-sft-20260320 | obsidian_docs | 1,600 | 206,794 |
| format-tool-discipline-sft-20260319 | tool_use | 1,582 | 179,009 |
| multi-step-debug-sft-20260319 | reasoning | 1,200 | 345,084 |
| real-world-seed-expansion-sft-20260321 | core_real | 1,200 | 238,907 |
| runtime-debug-grounded-sft-20260319 | repair | 1,193 | 218,845 |
| logic-core-sft-20260319 | logic | 1,131 | 139,188 |
| code-architecture-sft-20260319 | greenfield | 1,100 | 308,380 |
| tdd-test-first-sft-20260319 | reasoning | 1,000 | 310,540 |
| logic-sanity-sft-20260319 | logic | 996 | 69,260 |
| code-repair-patch-sft-20260319 | repair | 955 | 145,544 |
| logic-precision-ru-sft-20260319 | logic | 904 | 89,364 |
| security-repair-review-sft-20260319 | review | 893 | 159,086 |
| db-and-migrations-sft-20260319 | integration | 867 | 119,740 |
| agent-gap-fixes-ru-topup-sft-20260320 | agent_core | 700 | 89,641 |
| code-agent-tooluse-ru-topup-sft-20260320 | tool_use | 700 | 68,256 |
| multi-file-repo-repair-sft-20260319 | repair | 705 | 179,146 |
| backend-frontend-ops-sft-20260319 | integration | 606 | 121,591 |
| docs-topup-sft-20260320 | obsidian_docs | 600 | 125,125 |
| anti-overthinking-pack-sft-20260321 | regularizer | 500 | 55,840 |
| docs-markdown-sft-20260318-v3 | obsidian_docs | 440 | 149,852 |
| ts-rust-code-review-sft-20260318-v3 | review | 434 | 183,921 |
| robustness-noise-traps-ru-topup-sft-20260320 | robustness | 400 | 34,260 |
| ts-rust-coding-sft-20260318-v3 | greenfield | 388 | 254,951 |
| wave-03-growth-sft-20260320 | wave_backfill | 230 | 218,922 |
| docs-topup-ru-sft-20260320 | obsidian_docs | 100 | 9,079 |
| long-context-memory-topup-sft-20260321 | long_context | 24 | 350,954 |

Representative Training Examples

These are short excerpts from the actual broad-mix and component datasets used in the first public stream.

Broad mix: grounded repo repair

System

Ground the answer in one active boundary, not in general advice.

User

First fix failed. Repo: packages/notes-index
Patch 1 removed one fallback in `app/config.py`.
New evidence: `docker compose: service worker exited with code 1`.
Need the second minimal fix.

Assistant

The first patch hit the wrong seam. The new signal points back to `app/config.py`.

Broad mix: Russian ambiguity tightening

System

Write like a strict Russian-language logic editor: brief, clear, no philology for its own sake.

User

Evaluate this wording from an instruction:
`Разрешён доступ аналитики и владельцы без резервной копией.`

Assistant

Yes. It is unclear whether `без резервной копией` refers to both groups or only to the second.
Tighter: `И аналитики, и владельцы получают доступ без резервной копией.`

Tool-use source: strict JSON action selection

System

You are a code-workspace agent. Return only valid JSON.

User

Fix this broken tool call and return only valid JSON.
{"tool":"search_codebase","args":{"query":"normalize_note_path","path":"src","limit":"4"}}

Assistant

{"tool":"search_codebase","arguments":{"query":"normalize_note_path","path":"src"}}

Docs source: structured README generation

System

You are a technical writer for engineering workflows. Useful structure only.

User

Write a complete README.md for `md-audit`.
Context: scan Markdown repositories for broken links, empty headings, duplicate slugs, and metadata regressions.

Logic source: contradiction checking

System

Be careful with conditions and quantifiers. Prefer precision over flash.

User

Check this set for consistency.
1. `Борис` is a guest -> cannot publish.
2. `Борис` is a guest.
3. `Борис` can publish.

Assistant

No. There is a contradiction between the conclusion from 1+2 and item 3.

Training Recipe

The public run used the following settings; a TRL config sketch follows the list:

  • distributed setup: torchrun DDP
  • training framework: Unsloth + TRL
  • base model loading: 4-bit
  • LoRA rank: 16
  • LoRA alpha: 16
  • LoRA dropout: 0.0
  • max sequence length: 2048
  • per-device train batch size: 1
  • gradient accumulation steps: 8
  • effective global batch size: 16 examples per optimizer step (1 per device × 8 accumulation steps × 2 GPUs)
  • epochs: 1
  • optimizer: adamw_8bit
  • scheduler: cosine
  • learning rate: 1e-4
  • warmup steps: 5
  • gradient checkpointing: enabled
  • FP16: forced
  • packing: disabled
  • completion-only loss: disabled
  • public run total steps: 2256
  • logging / eval / save cadence: 50 / 125 / 250
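
As a sketch, the recipe above maps onto a TRL SFTConfig roughly as follows; field names track older TRL releases (e.g. max_seq_length was later renamed), and output_dir plus the eval strategy are assumptions:

from trl import SFTConfig

config = SFTConfig(
    output_dir="graphite-1.0-4b-lora",  # assumed; not stated in this card
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    optim="adamw_8bit",
    lr_scheduler_type="cosine",
    learning_rate=1e-4,
    warmup_steps=5,
    gradient_checkpointing=True,
    fp16=True,
    packing=False,
    max_seq_length=2048,
    logging_steps=50,
    eval_strategy="steps",
    eval_steps=125,
    save_steps=250,
)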

Prompt Style

This adapter was trained on a simple, explicit prompt layout:

System:
<system prompt>

User:
<user prompt>

Assistant:
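
A minimal helper that renders this layout (the function name is illustrative):

def build_prompt(system: str, user: str) -> str:
    # Render the System / User / Assistant layout shown above, leaving the
    # Assistant section open for the model to complete.
    return f"System:\n{system}\n\nUser:\n{user}\n\nAssistant:\n"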

For best results, keep prompts concise, grounded, and task-shaped. The adapter responds best to:

  • repo repair tasks with concrete evidence
  • exact wording / logic cleanup tasks
  • tool-call selection with explicit schemas
  • technical writing with clear requested sections
  • review / integration prompts that specify files, symptoms, and expected outcomes

Quick Start

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen3.5-4B-Base"
adapter_id = "Starred09/obsidian-critic-qwen35-4b-base-lora"

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)

system = "Return the smallest useful answer. Do not invent missing evidence."
user = "Repo: apps/desktop-shell. Build fails with ENOENT on dist/server.js. Point to the first file to inspect."
prompt = f"System:\n{system}\n\nUser:\n{user}\n\nAssistant:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=160)

print(tokenizer.decode(out[0], skip_special_tokens=True))
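
If you need standalone weights instead of an adapter (see Limitations), the LoRA deltas can be folded into the base model; a minimal sketch, with an illustrative output path:

# Merge the adapter into the base weights and drop the PEFT wrappers.
merged = model.merge_and_unload()
merged.save_pretrained("graphite-1.0-4b-merged")
tokenizer.save_pretrained("graphite-1.0-4b-merged")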

Intended Use

Graphite 1.0 4B is intended for:

  • coding assistants
  • repo triage and patch-planning copilots
  • Markdown / docs tooling assistants
  • logic and wording critique
  • bilingual technical task routing

It is especially useful when you want short, grounded, non-theatrical outputs instead of generic assistant prose.

Limitations

  • This is an adapter, not a standalone merged model.
  • It is tuned for structured technical work, not general consumer chat.
  • It inherits both strengths and weaknesses from Qwen/Qwen3.5-4B-Base.
  • The broad mix is intentionally heavy on repair, tool-use, and reasoning, so purely creative behavior is not a target.

License

This repository is released under Apache License 2.0. See LICENSE.

Please also review the license and usage terms of the base model, Qwen/Qwen3.5-4B-Base.

Acknowledgements

  • Alibaba Qwen team for the base model
  • Unsloth for the efficient LoRA training stack
  • TRL / Transformers / PEFT / PyTorch maintainers