---
license: apache-2.0
base_model: Qwen/Qwen3.5-4B-Base
library_name: peft
pipeline_tag: text-generation
model_name: Graphite 1.0 4B
language:
- en
- ru
tags:
- qwen
- qwen3.5
- peft
- lora
- unsloth
- trl
- sft
- code
- reasoning
- bilingual
- obsidian
- graphite
---

# Graphite 1.0 4B

`Graphite 1.0 4B` is the first public LoRA adapter from the Graphite / Obsidian-Critic training stream. It is built on top of [`Qwen/Qwen3.5-4B-Base`](https://huggingface.co/Qwen/Qwen3.5-4B-Base) and tuned for strict, grounded, low-noise responses across:

- repo repair and debugging
- agent tool-use formatting
- technical writing and Markdown workflows
- code review and integration tasks
- logic and factual precision
- bilingual Russian / English instruction following

## What This Repository Contains

This repo contains a **LoRA adapter**, not merged base weights.

- Base model: `Qwen/Qwen3.5-4B-Base`
- Adapter type: `LoRA`
- Rank: `r=16`
- Alpha: `16`
- Dropout: `0.0`
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`

Files of interest:

- `adapter_model.safetensors`: LoRA weights
- `adapter_config.json`: PEFT adapter config
- `tokenizer.json`, `tokenizer_config.json`, `chat_template.jinja`: tokenizer assets
- `run_summary.json`: public training run summary
- `length_stats.json`: length filtering summary
- `masking_sanity.json`: formatting sanity check

## Training Lineage

This adapter corresponds to the **first public Graphite 1.0 4B full fine-tune stream**.
- dataset family: **`obsidian-critic-broad-mix-20260321`**
- training stack: **Unsloth + TRL + torchrun DDP**
- base model: **`Qwen/Qwen3.5-4B-Base`**

Notebook lineage used for this stream:

- `obsidian_critic_qwen35_t4x2_unsloth_kaggle.ipynb`: smoke-test notebook for the broad mix
- `obsidian_critic_qwen35_t4x2_unsloth_kaggle_full.ipynb`: full fine-tune lineage used to produce the public LoRA run

## Dataset Provenance

The training data for this first public stream comes from the mixed dataset:

- dataset name: `obsidian-critic-broad-mix-20260321`
- examples in mixed dataset: `37,008`
- approximate token volume: `6,885,960`
- exact duplicate `(user, assistant)` pairs removed during mix build: `3,469`
- normalized near-duplicates removed from wave backfill rows: `201`
- dataset SHA-256: `5ba1924b46d08a8ab8ad7ed5e1f74b13cc3e847b3a04b714934953975fd9300a`

The public training run then created a deterministic train / validation split and applied sequence-length filtering:

- train rows before filter: `36,638`
- validation rows before filter: `370`
- train rows after filter: `36,081`
- validation rows after filter: `363`
- rows removed by length filtering: `564`
- minimum kept sequence length: `48`
- maximum kept sequence length: `2048`

### Mix Roles

| Role | Examples | Approx. tokens |
| --- | ---: | ---: |
| `repair` | 5,353 | 983,890 |
| `tool_use` | 4,682 | 455,600 |
| `core_real` | 4,200 | 1,043,187 |
| `robustness` | 3,600 | 397,399 |
| `agent_core` | 3,200 | 645,426 |
| `logic` | 3,031 | 297,812 |
| `factual` | 2,960 | 142,787 |
| `obsidian_docs` | 2,740 | 490,850 |
| `reasoning` | 2,200 | 655,624 |
| `greenfield` | 1,488 | 563,331 |
| `integration` | 1,473 | 241,331 |
| `review` | 1,327 | 343,007 |
| `regularizer` | 500 | 55,840 |
| `wave_backfill` | 230 | 218,922 |
| `long_context` | 24 | 350,954 |

### Source Dataset Table

| Dataset | Role | Examples | Approx. tokens |
| --- | --- | ---: | ---: |
| `real-world-grounded-topup-sft-20260320` | `core_real` | 3,000 | 804,280 |
| `robustness-noise-traps-sft-20260320` | `robustness` | 3,200 | 363,139 |
| `factual-erudition-sft-20260319` | `factual` | 2,960 | 142,787 |
| `agent-gap-fixes-sft-20260320` | `agent_core` | 2,500 | 555,785 |
| `code-fix-critical-topup-sft-20260321` | `repair` | 2,500 | 440,355 |
| `code-agent-tooluse-sft-20260319` | `tool_use` | 2,400 | 208,335 |
| `docs-engineering-review-topup-sft-20260320` | `obsidian_docs` | 1,600 | 206,794 |
| `format-tool-discipline-sft-20260319` | `tool_use` | 1,582 | 179,009 |
| `multi-step-debug-sft-20260319` | `reasoning` | 1,200 | 345,084 |
| `real-world-seed-expansion-sft-20260321` | `core_real` | 1,200 | 238,907 |
| `runtime-debug-grounded-sft-20260319` | `repair` | 1,193 | 218,845 |
| `logic-core-sft-20260319` | `logic` | 1,131 | 139,188 |
| `code-architecture-sft-20260319` | `greenfield` | 1,100 | 308,380 |
| `tdd-test-first-sft-20260319` | `reasoning` | 1,000 | 310,540 |
| `logic-sanity-sft-20260319` | `logic` | 996 | 69,260 |
| `code-repair-patch-sft-20260319` | `repair` | 955 | 145,544 |
| `logic-precision-ru-sft-20260319` | `logic` | 904 | 89,364 |
| `security-repair-review-sft-20260319` | `review` | 893 | 159,086 |
| `db-and-migrations-sft-20260319` | `integration` | 867 | 119,740 |
| `multi-file-repo-repair-sft-20260319` | `repair` | 705 | 179,146 |
| `agent-gap-fixes-ru-topup-sft-20260320` | `agent_core` | 700 | 89,641 |
| `code-agent-tooluse-ru-topup-sft-20260320` | `tool_use` | 700 | 68,256 |
| `backend-frontend-ops-sft-20260319` | `integration` | 606 | 121,591 |
| `docs-topup-sft-20260320` | `obsidian_docs` | 600 | 125,125 |
| `anti-overthinking-pack-sft-20260321` | `regularizer` | 500 | 55,840 |
| `docs-markdown-sft-20260318-v3` | `obsidian_docs` | 440 | 149,852 |
| `ts-rust-code-review-sft-20260318-v3` | `review` | 434 | 183,921 |
| `robustness-noise-traps-ru-topup-sft-20260320` | `robustness` | 400 | 34,260 |
| `ts-rust-coding-sft-20260318-v3` | `greenfield` | 388 | 254,951 |
| `wave-03-growth-sft-20260320` | `wave_backfill` | 230 | 218,922 |
| `docs-topup-ru-sft-20260320` | `obsidian_docs` | 100 | 9,079 |
| `long-context-memory-topup-sft-20260321` | `long_context` | 24 | 350,954 |

## Representative Training Examples

These are short excerpts from the actual broad-mix and component datasets used in the first public stream. Russian excerpts are reproduced verbatim, since bilingual instruction following is part of the training target.

### Broad mix: grounded repo repair

**System**

```text
Ground the answer in one active boundary, not in general advice.
```

**User**

```text
First fix failed. Repo: packages/notes-index
Patch 1 removed one fallback in `app/config.py`.
New evidence: `docker compose: service worker exited with code 1`.
Need the second minimal fix.
```

**Assistant**

```text
The first patch hit the wrong seam. The new signal points back to `app/config.py`.
```

### Broad mix: Russian ambiguity tightening

**System**

```text
Пиши как строгий русскоязычный логический редактор: коротко, ясно, без филологии ради филологии.
```

**User**

```text
Оцени формулировку из инструкции: `Разрешён доступ аналитики и владельцы без резервной копией.`
```

**Assistant**

```text
Да. Неясно, относится `без резервной копией` к обеим группам или только ко второй.
Строже: `И аналитики, и владельцы получают доступ без резервной копией.`
```

### Tool-use source: strict JSON action selection

**System**

```text
Ты code-workspace агент. Возвращай только валидный JSON.
```

**User**

```text
Исправь этот сломанный tool call и верни только валидный JSON.
{"tool":"search_codebase","args":{"query":"normalize_note_path","path":"src","limit":"4"}}
```

**Assistant**

```json
{"tool":"search_codebase","arguments":{"query":"normalize_note_path","path":"src"}}
```

### Docs source: structured README generation

**System**

```text
Ты technical writer для engineering workflows. Только полезная структура.
```

**User**

```text
Напиши полноценный README.md для `md-audit`.
Контекст: scan Markdown repositories for broken links, empty headings, duplicate slugs, and metadata regressions.
```

### Logic source: contradiction checking

**System**

```text
Будь аккуратен с условиями и кванторами. Предпочитай точность эффектности.
```

**User**

```text
Проверь набор на непротиворечивость.
1. `Борис` гость -> не может публиковать.
2. `Борис` гость.
3. `Борис` может публиковать.
```

**Assistant**

```text
Нет. Противоречие между выводом из 1+2 и пунктом 3.
```

## Training Recipe

The public run used:

- distributed setup: **`torchrun` DDP**
- training framework: **Unsloth + TRL**
- base model loading: **4-bit**
- LoRA rank: **16**
- LoRA alpha: **16**
- LoRA dropout: **0.0**
- max sequence length: **2048**
- per-device train batch size: **1**
- gradient accumulation steps: **8**
- effective global batch size: **16** examples / optimization step
- epochs: **1**
- optimizer: **`adamw_8bit`**
- scheduler: **cosine**
- learning rate: **1e-4**
- warmup steps: **5**
- gradient checkpointing: **enabled**
- FP16: **forced**
- packing: **disabled**
- completion-only loss: **disabled**
- public run total steps: **2256**
- logging / eval / save cadence: **50 / 125 / 250**

## Prompt Style

This adapter was trained on a simple, explicit prompt layout:

```text
System:
<system instruction>

User:
<user message>

Assistant:
```

For best results, keep prompts concise, grounded, and task-shaped.
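For reference, the Training Recipe hyperparameters above map onto TRL's `SFTConfig` roughly as follows. This is a hedged sketch, not the actual training script: the output directory name is ours, the exact trainer wiring (Unsloth patching, DDP launch) is omitted, and argument names follow recent TRL releases, so check them against your installed version.

```python
from trl import SFTConfig

# Sketch of the public run's hyperparameters as a TRL SFTConfig.
# Values come from the Training Recipe list; everything else is an assumption.
config = SFTConfig(
    output_dir="graphite-1.0-4b-lora",  # illustrative name, not part of the repo
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,      # 1 per device x 8 accum x 2 GPUs = 16 effective
    num_train_epochs=1,
    optim="adamw_8bit",
    lr_scheduler_type="cosine",
    learning_rate=1e-4,
    warmup_steps=5,
    gradient_checkpointing=True,
    fp16=True,
    max_seq_length=2048,
    packing=False,
    logging_steps=50,
    eval_strategy="steps",
    eval_steps=125,
    save_steps=250,
)
```

With 36,081 filtered train rows and an effective batch of 16, one epoch works out to the reported 2,256 optimization steps.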
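The layout above is plain string concatenation, so it can be wrapped in a small helper. This is an illustrative sketch, assuming the whitespace shown in this card's Quick Start example; the helper name is ours, not part of the repo.

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble the System / User / Assistant layout the adapter was trained on.

    Name and signature are illustrative, not part of this repository.
    """
    return f"System:\n{system}\n\nUser:\n{user}\n\nAssistant:\n"


# Example: a task-shaped repair prompt.
prompt = build_prompt(
    "Return the smallest useful answer. Do not invent missing evidence.",
    "Repo: packages/notes-index. Tests fail after the last patch.",
)
```

The trailing `Assistant:` line leaves the completion point open, so generation starts exactly where the training targets started.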
The adapter responds best to:

- repo repair tasks with concrete evidence
- exact wording / logic cleanup tasks
- tool-call selection with explicit schemas
- technical writing with clear requested sections
- review / integration prompts that specify files, symptoms, and expected outcomes

## Quick Start

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen3.5-4B-Base"
adapter_id = "Starred09/obsidian-critic-qwen35-4b-base-lora"

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)

system = "Return the smallest useful answer. Do not invent missing evidence."
user = "Repo: apps/desktop-shell. Build fails with ENOENT on dist/server.js. Point to the first file to inspect."

prompt = f"System:\n{system}\n\nUser:\n{user}\n\nAssistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=160)

print(tokenizer.decode(out[0], skip_special_tokens=True))
```

## Intended Use

Graphite 1.0 4B is intended for:

- coding assistants
- repo triage and patch-planning copilots
- Markdown / docs tooling assistants
- logic and wording critique
- bilingual technical task routing

It is especially useful when you want **short, grounded, non-theatrical outputs** instead of generic assistant prose.

## Limitations

- This is an **adapter**, not a standalone merged model.
- It is tuned for **structured technical work**, not general consumer chat.
- It inherits both strengths and weaknesses from `Qwen/Qwen3.5-4B-Base`.
- The broad mix is intentionally heavy on repair, tool-use, and reasoning, so purely creative behavior is not a target.

## License

This repository is released under the **Apache License 2.0**. See [`LICENSE`](./LICENSE).
Please also review the license and usage terms of the base model:

- [`Qwen/Qwen3.5-4B-Base`](https://huggingface.co/Qwen/Qwen3.5-4B-Base)

## Acknowledgements

- Alibaba Qwen team for the base model
- Unsloth for the efficient LoRA training stack
- TRL / Transformers / PEFT / PyTorch maintainers