Graphite 1.0 4B

Graphite 1.0 4B is the first public LoRA adapter from the Graphite / Obsidian-Critic training stream. It is built on top of Qwen/Qwen3.5-4B-Base and tuned for strict, grounded, low-noise responses across:

  • repo repair and debugging
  • agent tool-use formatting
  • technical writing and Markdown workflows
  • code review and integration tasks
  • logic and factual precision
  • bilingual Russian / English instruction following

What This Repository Contains

This repo contains a LoRA adapter, not merged base weights; an equivalent PEFT config is sketched after the settings list below.

  • Base model: Qwen/Qwen3.5-4B-Base
  • Adapter type: LoRA
  • Rank: r=16
  • Alpha: 16
  • Dropout: 0.0
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
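
For reference, a minimal sketch of the matching PEFT config (the task_type value is an assumption; everything else mirrors the list above):

from peft import LoraConfig

# Mirrors the adapter settings described above; task_type is assumed.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)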

Files of interest (a fetch sketch follows the list):

  • adapter_model.safetensors: LoRA weights
  • adapter_config.json: PEFT adapter config
  • tokenizer.json, tokenizer_config.json, chat_template.jinja: tokenizer assets
  • run_summary.json: public training run summary
  • length_stats.json: length filtering summary
  • masking_sanity.json: formatting sanity check
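
Any of these files can be pulled directly with huggingface_hub; a minimal sketch, assuming the repo id used in Quick Start below:

import json

from huggingface_hub import hf_hub_download

repo_id = "Starred09/obsidian-critic-qwen35-4b-base-lora"

# Download and inspect the public training run summary.
path = hf_hub_download(repo_id=repo_id, filename="run_summary.json")
with open(path) as f:
    print(json.dumps(json.load(f), indent=2))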

Training Lineage

This adapter corresponds to the first public Graphite 1.0 4B fine-tune stream (the full run rather than the smoke test; see the notebook lineage below).

  • dataset family: obsidian-critic-broad-mix-20260321
  • training stack: Unsloth + TRL + torchrun DDP
  • base model: Qwen/Qwen3.5-4B-Base

Notebook lineage used for this stream:

  • obsidian_critic_qwen35_t4x2_unsloth_kaggle.ipynb: smoke-test notebook for the broad mix
  • obsidian_critic_qwen35_t4x2_unsloth_kaggle_full.ipynb: full fine-tune lineage used to produce the public LoRA run

Dataset Provenance

The training data for this first public stream comes from the mixed dataset below; a checksum-verification sketch follows the list:

  • dataset name: obsidian-critic-broad-mix-20260321
  • examples in mixed dataset: 37,008
  • approximate token volume: 6,885,960
  • exact duplicate (user, assistant) pairs removed during mix build: 3,469
  • normalized near-duplicates removed from wave backfill rows: 201
  • dataset SHA-256: 5ba1924b46d08a8ab8ad7ed5e1f74b13cc3e847b3a04b714934953975fd9300a
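
A minimal sketch for verifying the published checksum (the local file name is hypothetical; the card does not state which exported artifact the hash covers):

import hashlib

EXPECTED = "5ba1924b46d08a8ab8ad7ed5e1f74b13cc3e847b3a04b714934953975fd9300a"

# Stream the file in 1 MiB chunks so large exports do not need to fit in memory.
digest = hashlib.sha256()
with open("obsidian-critic-broad-mix-20260321.jsonl", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)

assert digest.hexdigest() == EXPECTED, "dataset checksum mismatch"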

The public training run then created a deterministic train / validation split and applied sequence-length filtering, as sketched after this list:

  • train rows before filter: 36,638
  • validation rows before filter: 370
  • train rows after filter: 36,081
  • validation rows after filter: 363
  • removed for length filtering: 564
  • minimum kept sequence length: 48
  • maximum kept sequence length: 2048
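
A minimal sketch of that step, assuming each row carries a pre-rendered "text" field; the split seed, field name, and file name are assumptions, while the row counts and the 48-2048 token bounds come from the list above:

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-4B-Base")
ds = load_dataset("json", data_files="obsidian-critic-broad-mix-20260321.jsonl")["train"]

# Deterministic split: 370 validation rows held out with a fixed seed.
split = ds.train_test_split(test_size=370, seed=42)

def keep(row):
    # Keep rows whose tokenized length falls inside the published bounds.
    n_tokens = len(tokenizer(row["text"]).input_ids)
    return 48 <= n_tokens <= 2048

train_rows = split["train"].filter(keep)  # 36,638 -> 36,081 in the public run
val_rows = split["test"].filter(keep)     # 370 -> 363 in the public run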

Mix Roles

| Role          | Examples | Approx. tokens |
| ------------- | -------: | -------------: |
| repair        |    5,353 |        983,890 |
| tool_use      |    4,682 |        455,600 |
| core_real     |    4,200 |      1,043,187 |
| robustness    |    3,600 |        397,399 |
| agent_core    |    3,200 |        645,426 |
| logic         |    3,031 |        297,812 |
| factual       |    2,960 |        142,787 |
| obsidian_docs |    2,740 |        490,850 |
| reasoning     |    2,200 |        655,624 |
| greenfield    |    1,488 |        563,331 |
| integration   |    1,473 |        241,331 |
| review        |    1,327 |        343,007 |
| regularizer   |      500 |         55,840 |
| wave_backfill |      230 |        218,922 |
| long_context  |       24 |        350,954 |

Source Dataset Table

| Dataset | Role | Examples | Approx. tokens |
| --- | --- | ---: | ---: |
| real-world-grounded-topup-sft-20260320 | core_real | 3,000 | 804,280 |
| robustness-noise-traps-sft-20260320 | robustness | 3,200 | 363,139 |
| factual-erudition-sft-20260319 | factual | 2,960 | 142,787 |
| agent-gap-fixes-sft-20260320 | agent_core | 2,500 | 555,785 |
| code-fix-critical-topup-sft-20260321 | repair | 2,500 | 440,355 |
| code-agent-tooluse-sft-20260319 | tool_use | 2,400 | 208,335 |
| docs-engineering-review-topup-sft-20260320 | obsidian_docs | 1,600 | 206,794 |
| format-tool-discipline-sft-20260319 | tool_use | 1,582 | 179,009 |
| multi-step-debug-sft-20260319 | reasoning | 1,200 | 345,084 |
| real-world-seed-expansion-sft-20260321 | core_real | 1,200 | 238,907 |
| runtime-debug-grounded-sft-20260319 | repair | 1,193 | 218,845 |
| logic-core-sft-20260319 | logic | 1,131 | 139,188 |
| code-architecture-sft-20260319 | greenfield | 1,100 | 308,380 |
| tdd-test-first-sft-20260319 | reasoning | 1,000 | 310,540 |
| logic-sanity-sft-20260319 | logic | 996 | 69,260 |
| code-repair-patch-sft-20260319 | repair | 955 | 145,544 |
| logic-precision-ru-sft-20260319 | logic | 904 | 89,364 |
| security-repair-review-sft-20260319 | review | 893 | 159,086 |
| db-and-migrations-sft-20260319 | integration | 867 | 119,740 |
| agent-gap-fixes-ru-topup-sft-20260320 | agent_core | 700 | 89,641 |
| code-agent-tooluse-ru-topup-sft-20260320 | tool_use | 700 | 68,256 |
| multi-file-repo-repair-sft-20260319 | repair | 705 | 179,146 |
| backend-frontend-ops-sft-20260319 | integration | 606 | 121,591 |
| docs-topup-sft-20260320 | obsidian_docs | 600 | 125,125 |
| anti-overthinking-pack-sft-20260321 | regularizer | 500 | 55,840 |
| docs-markdown-sft-20260318-v3 | obsidian_docs | 440 | 149,852 |
| ts-rust-code-review-sft-20260318-v3 | review | 434 | 183,921 |
| robustness-noise-traps-ru-topup-sft-20260320 | robustness | 400 | 34,260 |
| ts-rust-coding-sft-20260318-v3 | greenfield | 388 | 254,951 |
| wave-03-growth-sft-20260320 | wave_backfill | 230 | 218,922 |
| docs-topup-ru-sft-20260320 | obsidian_docs | 100 | 9,079 |
| long-context-memory-topup-sft-20260321 | long_context | 24 | 350,954 |

Representative Training Examples

These are short excerpts from the actual broad-mix and component datasets used in the first public stream.

Broad mix: grounded repo repair

System

Ground the answer in one active boundary, not in general advice.

User

First fix failed. Repo: packages/notes-index
Patch 1 removed one fallback in `app/config.py`.
New evidence: `docker compose: service worker exited with code 1`.
Need the second minimal fix.

Assistant

The first patch hit the wrong seam. The new signal points back to `app/config.py`.

Broad mix: Russian ambiguity tightening

System

Write like a strict Russian-language logic editor: brief, clear, no philology for its own sake.

User

Evaluate this wording from an instruction:
`Разрешён доступ аналитики и владельцы без резервной копией.`

Assistant

Yes. It is unclear whether `без резервной копией` refers to both groups or only to the second.
Tighter: `И аналитики, и владельцы получают доступ без резервной копией.`

Tool-use source: strict JSON action selection

System

You are a code-workspace agent. Return only valid JSON.

User

Fix this broken tool call and return only valid JSON.
{"tool":"search_codebase","args":{"query":"normalize_note_path","path":"src","limit":"4"}}

Assistant

{"tool":"search_codebase","arguments":{"query":"normalize_note_path","path":"src"}}

Docs source: structured README generation

System

You are a technical writer for engineering workflows. Useful structure only.

User

Write a complete README.md for `md-audit`.
Context: scan Markdown repositories for broken links, empty headings, duplicate slugs, and metadata regressions.

Logic source: contradiction checking

System

Be careful with conditions and quantifiers. Prefer precision over flash.

User

Check this set for consistency.
1. `Борис` is a guest -> cannot publish.
2. `Борис` is a guest.
3. `Борис` can publish.

Assistant

No. There is a contradiction between the conclusion from 1+2 and item 3.

Training Recipe

The public run used the following settings; a TRL config sketch follows the list:

  • distributed setup: torchrun DDP
  • training framework: Unsloth + TRL
  • base model loading: 4-bit
  • LoRA rank: 16
  • LoRA alpha: 16
  • LoRA dropout: 0.0
  • max sequence length: 2048
  • per-device train batch size: 1
  • gradient accumulation steps: 8
  • effective global batch size: 16 examples per optimizer step (1 per device × 8 accumulation steps × 2 GPUs)
  • epochs: 1
  • optimizer: adamw_8bit
  • scheduler: cosine
  • learning rate: 1e-4
  • warmup steps: 5
  • gradient checkpointing: enabled
  • FP16: forced
  • packing: disabled
  • completion-only loss: disabled
  • public run total steps: 2256
  • logging / eval / save cadence: 50 / 125 / 250
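
As a sketch, the recipe above maps onto a TRL SFTConfig roughly as follows; field names track older TRL releases (e.g. max_seq_length was later renamed), and output_dir plus the eval strategy are assumptions:

from trl import SFTConfig

config = SFTConfig(
    output_dir="graphite-1.0-4b-lora",  # assumed; not stated in this card
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    optim="adamw_8bit",
    lr_scheduler_type="cosine",
    learning_rate=1e-4,
    warmup_steps=5,
    gradient_checkpointing=True,
    fp16=True,
    packing=False,
    max_seq_length=2048,
    logging_steps=50,
    eval_strategy="steps",
    eval_steps=125,
    save_steps=250,
)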

Prompt Style

This adapter was trained on a simple, explicit prompt layout:

System:
<system prompt>

User:
<user prompt>

Assistant:
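
A minimal helper that renders this layout (the function name is illustrative):

def build_prompt(system: str, user: str) -> str:
    # Render the System / User / Assistant layout shown above, leaving the
    # Assistant section open for the model to complete.
    return f"System:\n{system}\n\nUser:\n{user}\n\nAssistant:\n"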

For best results, keep prompts concise, grounded, and task-shaped. The adapter responds best to:

  • repo repair tasks with concrete evidence
  • exact wording / logic cleanup tasks
  • tool-call selection with explicit schemas
  • technical writing with clear requested sections
  • review / integration prompts that specify files, symptoms, and expected outcomes

Quick Start

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen3.5-4B-Base"
adapter_id = "Starred09/obsidian-critic-qwen35-4b-base-lora"

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)

system = "Return the smallest useful answer. Do not invent missing evidence."
user = "Repo: apps/desktop-shell. Build fails with ENOENT on dist/server.js. Point to the first file to inspect."
prompt = f"System:\n{system}\n\nUser:\n{user}\n\nAssistant:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=160)

print(tokenizer.decode(out[0], skip_special_tokens=True))
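
If you need standalone weights instead of an adapter (see Limitations), the LoRA deltas can be folded into the base model; a minimal sketch, with an illustrative output path:

# Merge the adapter into the base weights and drop the PEFT wrappers.
merged = model.merge_and_unload()
merged.save_pretrained("graphite-1.0-4b-merged")
tokenizer.save_pretrained("graphite-1.0-4b-merged")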

Intended Use

Graphite 1.0 4B is intended for:

  • coding assistants
  • repo triage and patch-planning copilots
  • Markdown / docs tooling assistants
  • logic and wording critique
  • bilingual technical task routing

It is especially useful when you want short, grounded, non-theatrical outputs instead of generic assistant prose.

Limitations

  • This is an adapter, not a standalone merged model.
  • It is tuned for structured technical work, not general consumer chat.
  • It inherits both strengths and weaknesses from Qwen/Qwen3.5-4B-Base.
  • The broad mix is intentionally heavy on repair, tool-use, and reasoning, so purely creative behavior is not a target.

License

This repository is released under Apache License 2.0. See LICENSE.

Please also review the license and usage terms of the base model, Qwen/Qwen3.5-4B-Base.

Acknowledgements

  • Alibaba Qwen team for the base model
  • Unsloth for the efficient LoRA training stack
  • TRL / Transformers / PEFT / PyTorch maintainers