Starred09 committed
Commit 05e5a00 · 1 Parent(s): b75ef7d

Rewrite model card for Graphite 1.0 4B and add Apache-2.0 license

Files changed (2):
  1. LICENSE +162 -0
  2. README.md +330 -37
LICENSE ADDED
@@ -0,0 +1,162 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction, and
+ distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by the
+ copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all other
+ entities that control, are controlled by, or are under common control with
+ that entity. For the purposes of this definition, "control" means (i) the
+ power, direct or indirect, to cause the direction or management of such
+ entity, whether by contract or otherwise, or (ii) ownership of fifty percent
+ (50%) or more of the outstanding shares, or (iii) beneficial ownership of
+ such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity exercising
+ permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation source, and
+ configuration files.
+
+ "Object" form shall mean any form resulting from mechanical transformation or
+ translation of a Source form, including but not limited to compiled object
+ code, generated documentation, and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or Object form,
+ made available under the License, as indicated by a copyright notice that is
+ included in or attached to the work (an example is provided in the Appendix
+ below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object form,
+ that is based on (or derived from) the Work and for which the editorial
+ revisions, annotations, elaborations, or other modifications represent, as a
+ whole, an original work of authorship. For the purposes of this License,
+ Derivative Works shall not include works that remain separable from, or
+ merely link (or bind by name) to the interfaces of, the Work and Derivative
+ Works thereof.
+
+ "Contribution" shall mean any work of authorship, including the original
+ version of the Work and any modifications or additions to that Work or
+ Derivative Works thereof, that is intentionally submitted to Licensor for
+ inclusion in the Work by the copyright owner or by an individual or Legal
+ Entity authorized to submit on behalf of the copyright owner. For the
+ purposes of this definition, "submitted" means any form of electronic, verbal,
+ or written communication sent to the Licensor or its representatives,
+ including but not limited to communication on electronic mailing lists,
+ source code control systems, and issue tracking systems that are managed by,
+ or on behalf of, the Licensor for the purpose of discussing and improving the
+ Work, but excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity on
+ behalf of whom a Contribution has been received by Licensor and subsequently
+ incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of this
+ License, each Contributor hereby grants to You a perpetual, worldwide,
+ non-exclusive, no-charge, royalty-free, irrevocable copyright license to
+ reproduce, prepare Derivative Works of, publicly display, publicly perform,
+ sublicense, and distribute the Work and such Derivative Works in Source or
+ Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of this
+ License, each Contributor hereby grants to You a perpetual, worldwide,
+ non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this
+ section) patent license to make, have made, use, offer to sell, sell, import,
+ and otherwise transfer the Work, where such license applies only to those
+ patent claims licensable by such Contributor that are necessarily infringed
+ by their Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You institute
+ patent litigation against any entity (including a cross-claim or counterclaim
+ in a lawsuit) alleging that the Work or a Contribution incorporated within
+ the Work constitutes direct or contributory patent infringement, then any
+ patent licenses granted to You under this License for that Work shall
+ terminate as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the Work or
+ Derivative Works thereof in any medium, with or without modifications, and in
+ Source or Object form, provided that You meet the following conditions:
+
+ (a) You must give any other recipients of the Work or Derivative Works a copy
+ of this License; and
+
+ (b) You must cause any modified files to carry prominent notices stating that
+ You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works that You
+ distribute, all copyright, patent, trademark, and attribution notices from
+ the Source form of the Work, excluding those notices that do not pertain to
+ any part of the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its distribution,
+ then any Derivative Works that You distribute must include a readable copy of
+ the attribution notices contained within such NOTICE file, excluding those
+ notices that do not pertain to any part of the Derivative Works, in at least
+ one of the following places: within a NOTICE text file distributed as part of
+ the Derivative Works; within the Source form or documentation, if provided
+ along with the Derivative Works; or, within a display generated by the
+ Derivative Works, if and wherever such third-party notices normally appear.
+ The contents of the NOTICE file are for informational purposes only and do
+ not modify the License. You may add Your own attribution notices within
+ Derivative Works that You distribute, alongside or as an addendum to the
+ NOTICE text from the Work, provided that such additional attribution notices
+ cannot be construed as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and may provide
+ additional or different license terms and conditions for use, reproduction,
+ or distribution of Your modifications, or for any such Derivative Works as a
+ whole, provided Your use, reproduction, and distribution of the Work otherwise
+ complies with the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise, any
+ Contribution intentionally submitted for inclusion in the Work by You to the
+ Licensor shall be under the terms and conditions of this License, without any
+ additional terms or conditions. Notwithstanding the above, nothing herein
+ shall supersede or modify the terms of any separate license agreement you may
+ have executed with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade names,
+ trademarks, service marks, or product names of the Licensor, except as
+ required for reasonable and customary use in describing the origin of the
+ Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in
+ writing, Licensor provides the Work (and each Contributor provides its
+ Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied, including, without limitation, any warranties
+ or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any risks
+ associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory, whether in
+ tort (including negligence), contract, or otherwise, unless required by
+ applicable law (such as deliberate and grossly negligent acts) or agreed to
+ in writing, shall any Contributor be liable to You for damages, including any
+ direct, indirect, special, incidental, or consequential damages of any
+ character arising as a result of this License or out of the use or inability
+ to use the Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all other
+ commercial damages or losses), even if such Contributor has been advised of
+ the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing the Work
+ or Derivative Works thereof, You may choose to offer, and charge a fee for,
+ acceptance of support, warranty, indemnity, or other liability obligations
+ and/or rights consistent with this License. However, in accepting such
+ obligations, You may act only on Your own behalf and on Your sole
+ responsibility, not on behalf of any other Contributor, and only if You agree
+ to indemnify, defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason of your
+ accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
README.md CHANGED
@@ -1,59 +1,352 @@
  ---
  base_model: Qwen/Qwen3.5-4B-Base
- library_name: transformers
- model_name: obsidian-critic-qwen35-4b-base-lora
  tags:
- - generated_from_trainer
- - sft
- - trl
  - unsloth
- licence: license
  ---

- # Model Card for obsidian-critic-qwen35-4b-base-lora

- This model is a fine-tuned version of [Qwen/Qwen3.5-4B-Base](https://huggingface.co/Qwen/Qwen3.5-4B-Base).
- It has been trained using [TRL](https://github.com/huggingface/trl).

- ## Quick start

  ```python
- from transformers import pipeline

- question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
- generator = pipeline("text-generation", model="Starred09/obsidian-critic-qwen35-4b-base-lora", device="cuda")
- output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
- print(output["generated_text"])
  ```

- ## Training procedure
-
- This model was trained with SFT.

- ### Framework versions

- - TRL: 0.24.0
- - Transformers: 5.3.0
- - Pytorch: 2.10.0+cu128
- - Datasets: 4.3.0
- - Tokenizers: 0.22.2

- ## Citations

- Cite TRL as:
-
- ```bibtex
- @misc{vonwerra2022trl,
-     title = {{TRL: Transformer Reinforcement Learning}},
-     author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
-     year = 2020,
-     journal = {GitHub repository},
-     publisher = {GitHub},
-     howpublished = {\url{https://github.com/huggingface/trl}}
- }
- ```
  ---
+ license: apache-2.0
  base_model: Qwen/Qwen3.5-4B-Base
+ library_name: peft
+ pipeline_tag: text-generation
+ model_name: Graphite 1.0 4B
+ language:
+ - en
+ - ru
  tags:
+ - qwen
+ - qwen3.5
+ - peft
+ - lora
  - unsloth
+ - trl
+ - sft
+ - code
+ - reasoning
+ - bilingual
+ - obsidian
+ - graphite
  ---

+ # Graphite 1.0 4B
+
+ `Graphite 1.0 4B` is the first public LoRA adapter from the Graphite / Obsidian-Critic training stream. It is built on top of [`Qwen/Qwen3.5-4B-Base`](https://huggingface.co/Qwen/Qwen3.5-4B-Base) and tuned for strict, grounded, low-noise responses across:
+
+ - repo repair and debugging
+ - agent tool-use formatting
+ - technical writing and Markdown workflows
+ - code review and integration tasks
+ - logic and factual precision
+ - bilingual Russian / English instruction following
+
+ This repository keeps the legacy slug `obsidian-critic-qwen35-4b-base-lora` because it was the original public upload target; the public model name used in documentation and grant material is **Graphite 1.0 4B**.
+ ## What This Repository Contains
+
+ This repo contains a **LoRA adapter**, not merged base weights.
+
+ - Base model: `Qwen/Qwen3.5-4B-Base`
+ - Adapter type: `LoRA`
+ - Rank: `r=16`
+ - Alpha: `16`
+ - Dropout: `0.0`
+ - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
+
+ Files of interest:
+
+ - `adapter_model.safetensors`: LoRA weights
+ - `adapter_config.json`: PEFT adapter config
+ - `tokenizer.json`, `tokenizer_config.json`, `chat_template.jinja`: tokenizer assets
+ - `run_summary.json`: public training run summary
+ - `length_stats.json`: length filtering summary
+ - `masking_sanity.json`: formatting sanity check
+
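The hyperparameters above can be sanity-checked against `adapter_config.json`. A minimal sketch, assuming the standard PEFT `LoraConfig` field names; the values are the ones this card lists, not a dump of the shipped file:

```python
# Hedged mirror of the key fields this card lists for adapter_config.json.
# Field names follow the usual PEFT LoraConfig schema; values come from the
# bullets above.
adapter_config = {
    "peft_type": "LORA",
    "base_model_name_or_path": "Qwen/Qwen3.5-4B-Base",
    "r": 16,
    "lora_alpha": 16,
    "lora_dropout": 0.0,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

# With alpha equal to rank, the LoRA scaling factor alpha / r is exactly 1.0,
# i.e. adapter updates are applied without extra rescaling.
scaling = adapter_config["lora_alpha"] / adapter_config["r"]
print(scaling)  # 1.0
```

To compare against the shipped file, `json.load` the real `adapter_config.json` and diff the shared keys.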
+ ## Training Lineage
+
+ This adapter corresponds to the **first public Kaggle 2xT4 full fine-tune stream**, before the later `Graphite 1.1` reweight experiment. For provenance purposes:
+
+ - the original dataset family is **`obsidian-critic-broad-mix-20260321`**
+ - the training stack is **Unsloth + TRL + torchrun DDP on dual T4**
+ - this repo was the public upload target for the run
+ - later `Graphite 1.1` experiments are intentionally excluded from this card
+
+ Notebook lineage used for this stream:
+
+ - `obsidian_critic_qwen35_t4x2_unsloth_kaggle.ipynb`: smoke-test notebook for the broad mix
+ - `obsidian_critic_qwen35_t4x2_unsloth_kaggle_full.ipynb`: full fine-tune lineage used to produce the public LoRA run
+
+ ## Dataset Provenance
+
+ The training data for this first public stream comes from the mixed dataset:
+
+ - dataset name: `obsidian-critic-broad-mix-20260321`
+ - local source dir: `/home/starred/datasets/obsidian-critic-broad-mix-20260321`
+ - examples in mixed dataset: `37,008`
+ - approximate token volume: `6,885,960`
+ - exact duplicate `(user, assistant)` pairs removed during mix build: `3,469`
+ - normalized near-duplicates removed from wave backfill rows: `201`
+ - dataset SHA-256: `5ba1924b46d08a8ab8ad7ed5e1f74b13cc3e847b3a04b714934953975fd9300a`
+
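If you rebuild or download the mix, the SHA-256 above can be re-checked with a short streaming hash. A hedged sketch; the file path in the commented assertion is a placeholder, since this card does not state the on-disk artifact name:

```python
import hashlib

# SHA-256 listed in the provenance bullets above.
EXPECTED_SHA256 = "5ba1924b46d08a8ab8ad7ed5e1f74b13cc3e847b3a04b714934953975fd9300a"

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large dataset files never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Placeholder path; substitute the actual mix artifact:
# assert sha256_of("obsidian-critic-broad-mix-20260321.jsonl") == EXPECTED_SHA256
```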
+ The public training run then created a deterministic train / validation split and applied sequence-length filtering:

+ - train rows before filter: `36,638`
+ - validation rows before filter: `370`
+ - train rows after filter: `36,081`
+ - validation rows after filter: `363`
+ - rows removed by length filtering: `564`
+ - minimum kept sequence length: `48`
+ - maximum kept sequence length: `2048`
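The filter reads as an inclusive token-length window. A toy sketch with made-up lengths, assuming the bounds are inclusive on both ends (sequences of exactly 48 or 2048 tokens survive):

```python
MIN_LEN, MAX_LEN = 48, 2048  # minimum / maximum kept sequence length

def keep(token_count: int) -> bool:
    # Inclusive window: 48 and 2048 are kept, 47 and 2049 are dropped.
    return MIN_LEN <= token_count <= MAX_LEN

sample_lengths = [12, 48, 512, 2048, 2049]  # illustrative counts, not real data
kept = [n for n in sample_lengths if keep(n)]
print(kept)  # [48, 512, 2048]
```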

+ ### Mix Roles
+
+ | Role | Examples | Approx. tokens |
+ | --- | ---: | ---: |
+ | `repair` | 5,353 | 983,890 |
+ | `tool_use` | 4,682 | 455,600 |
+ | `core_real` | 4,200 | 1,043,187 |
+ | `robustness` | 3,600 | 397,399 |
+ | `agent_core` | 3,200 | 645,426 |
+ | `logic` | 3,031 | 297,812 |
+ | `factual` | 2,960 | 142,787 |
+ | `obsidian_docs` | 2,740 | 490,850 |
+ | `reasoning` | 2,200 | 655,624 |
+ | `greenfield` | 1,488 | 563,331 |
+ | `integration` | 1,473 | 241,331 |
+ | `review` | 1,327 | 343,007 |
+ | `regularizer` | 500 | 55,840 |
+ | `wave_backfill` | 230 | 218,922 |
+ | `long_context` | 24 | 350,954 |
+
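The per-role counts are internally consistent with the mix size reported under Dataset Provenance; summing the Examples column reproduces the 37,008 total:

```python
# Examples column from the Mix Roles table above.
role_examples = {
    "repair": 5_353, "tool_use": 4_682, "core_real": 4_200,
    "robustness": 3_600, "agent_core": 3_200, "logic": 3_031,
    "factual": 2_960, "obsidian_docs": 2_740, "reasoning": 2_200,
    "greenfield": 1_488, "integration": 1_473, "review": 1_327,
    "regularizer": 500, "wave_backfill": 230, "long_context": 24,
}

total = sum(role_examples.values())
print(total)  # 37008, matching "examples in mixed dataset"
```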
+ ### Source Datasets Included In The Broad Mix
+
+ - **`obsidian_docs`**: `docs-markdown-sft-20260318-v3`, `docs-engineering-review-topup-sft-20260320`, `docs-topup-sft-20260320`, `docs-topup-ru-sft-20260320`
+ - **`tool_use`**: `format-tool-discipline-sft-20260319`, `code-agent-tooluse-sft-20260319`, `code-agent-tooluse-ru-topup-sft-20260320`
+ - **`greenfield`**: `code-architecture-sft-20260319`, `ts-rust-coding-sft-20260318-v3`
+ - **`repair`**: `runtime-debug-grounded-sft-20260319`, `multi-file-repo-repair-sft-20260319`, `code-repair-patch-sft-20260319`, `code-fix-critical-topup-sft-20260321`
+ - **`review`**: `security-repair-review-sft-20260319`, `ts-rust-code-review-sft-20260318-v3`
+ - **`integration`**: `db-and-migrations-sft-20260319`, `backend-frontend-ops-sft-20260319`
+ - **`reasoning`**: `tdd-test-first-sft-20260319`, `multi-step-debug-sft-20260319`
+ - **`agent_core`**: `agent-gap-fixes-sft-20260320`, `agent-gap-fixes-ru-topup-sft-20260320`
+ - **`robustness`**: `robustness-noise-traps-sft-20260320`, `robustness-noise-traps-ru-topup-sft-20260320`
+ - **`core_real`**: `real-world-grounded-topup-sft-20260320`, `real-world-seed-expansion-sft-20260321`
+ - **`long_context`**: `long-context-memory-topup-sft-20260321`
+ - **`regularizer`**: `anti-overthinking-pack-sft-20260321`
+ - **`logic`**: `logic-core-sft-20260319`, `logic-sanity-sft-20260319`, `logic-precision-ru-sft-20260319`
+ - **`factual`**: `factual-erudition-sft-20260319`
+ - **`wave_backfill`**: `wave-01-growth-sft-20260319`, `wave-02-growth-sft-20260319`, `wave-03-growth-sft-20260320`
+
+ Explicitly excluded from the mix build:
+
+ - `anti-regression-eval-20260319`
+ - `curated-code-train-mix-20260320`
+ - `graphite-1.0-code-train-mix-20260321`
+ - `css-ui-premium-sft-20260319`
+ - `css-style-premium-sft-20260320`
+ - `css-style-premium-ru-topup-sft-20260320`
+ - `css-style-sft-20260318`
+
+ ## Representative Training Examples
+
+ These are short excerpts from the actual broad-mix and component datasets used in the first public stream.
+
+ ### Broad mix: grounded repo repair
+
+ **System**
+ ```text
+ Ground the answer in one active boundary, not in general advice.
+ ```
+
+ **User**
+ ```text
+ First fix failed. Repo: packages/notes-index
+ Patch 1 removed one fallback in `app/config.py`.
+ New evidence: `docker compose: service worker exited with code 1`.
+ Need the second minimal fix.
+ ```
+
+ **Assistant**
+ ```text
+ The first patch hit the wrong seam. The new signal points back to `app/config.py`.
+ ```
+
+ ### Broad mix: Russian ambiguity tightening
+
+ **System**
+ ```text
+ Пиши как строгий русскоязычный логический редактор: коротко, ясно, без филологии ради филологии.
+ ```
+
+ **User**
+ ```text
+ Оцени формулировку из инструкции:
+ `Разрешён доступ аналитики и владельцы без резервной копией.`
+ ```
+
+ **Assistant**
+ ```text
+ Да. Неясно, относится `без резервной копией` к обеим группам или только ко второй.
+ Строже: `И аналитики, и владельцы получают доступ без резервной копией.`
+ ```
+
+ ### Tool-use source: strict JSON action selection
+
+ **System**
+ ```text
+ Ты code-workspace агент. Возвращай только валидный JSON.
+ ```
+
+ **User**
+ ```text
+ Исправь этот сломанный tool call и верни только валидный JSON.
+ {"tool":"search_codebase","args":{"query":"normalize_note_path","path":"src","limit":"4"}}
+ ```
+
+ **Assistant**
+ ```json
+ {"tool":"search_codebase","arguments":{"query":"normalize_note_path","path":"src"}}
+ ```
+
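The tool-use excerpt rewards calls that use an `arguments` key and drop parameters outside the tool's schema. A hedged validator sketch; the allowed-parameter set is illustrative, not the dataset's actual schema:

```python
import json

# Hypothetical schema for search_codebase; only these string params are known.
ALLOWED_PARAMS = {"query", "path"}

def is_valid_call(raw: str) -> bool:
    """Accept only {"tool": ..., "arguments": {...}} with known string params."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(call, dict) or set(call) != {"tool", "arguments"}:
        return False
    args = call["arguments"]
    return (isinstance(args, dict)
            and set(args) <= ALLOWED_PARAMS
            and all(isinstance(v, str) for v in args.values()))

# The fixed call from the excerpt passes; the broken one (wrong "args" key,
# unknown "limit" parameter) fails.
print(is_valid_call('{"tool":"search_codebase","arguments":{"query":"normalize_note_path","path":"src"}}'))  # True
print(is_valid_call('{"tool":"search_codebase","args":{"query":"x","limit":"4"}}'))  # False
```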
203
+ ### Docs source: structured README generation
204
+
205
+ **System**
206
+ ```text
207
+ Ты technical writer для engineering workflows. Только полезная структура.
208
+ ```
209
+
210
+ **User**
211
+ ```text
212
+ Напиши полноценный README.md для `md-audit`.
213
+ Контекст: scan Markdown repositories for broken links, empty headings, duplicate slugs, and metadata regressions.
214
+ ```
215
+
216
+ ### Logic source: contradiction checking
217
+
218
+ **System**
219
+ ```text
220
+ Будь аккуратен с условиями и кванторами. Предпочитай точность эффектности.
221
+ ```
222
+
223
+ **User**
224
+ ```text
225
+ Проверь набор на непротиворечивость.
226
+ 1. `Борис` гость -> не может публиковать.
227
+ 2. `Борис` гость.
228
+ 3. `Борис` может публиковать.
229
+ ```
230
+
231
+ **Assistant**
232
+ ```text
233
+ Нет. Противоречие между выводом из 1+2 и пунктом 3.
234
+ ```
235
+
+ ## Training Recipe
+
+ The public run in this repo used:
+
+ - hardware: **Kaggle dual T4**
+ - distributed setup: **`torchrun` DDP**
+ - training framework: **Unsloth + TRL**
+ - base model loading: **4-bit**
+ - LoRA rank: **16**
+ - LoRA alpha: **16**
+ - LoRA dropout: **0.0**
+ - max sequence length: **2048**
+ - per-device train batch size: **1**
+ - gradient accumulation steps: **8**
+ - effective global batch size: **16** examples per optimization step
+ - epochs: **1**
+ - optimizer: **`adamw_8bit`**
+ - scheduler: **cosine**
+ - learning rate: **1e-4**
+ - warmup steps: **5**
+ - gradient checkpointing: **enabled**
+ - FP16: **forced**
+ - packing: **disabled**
+ - completion-only loss: **disabled**
+ - public run total steps: **2256**
+ - logging / eval / save cadence: **50 / 125 / 250** steps
+
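The batch and step numbers above are mutually consistent. A quick arithmetic check, assuming both T4s contribute to the global batch and the single epoch covers the 36,081 filtered train rows:

```python
import math

num_gpus = 2            # Kaggle dual T4, torchrun DDP
per_device_batch = 1
grad_accum_steps = 8

effective_batch = num_gpus * per_device_batch * grad_accum_steps
print(effective_batch)  # 16, the stated effective global batch size

train_rows = 36_081     # train rows after length filtering
steps = math.ceil(train_rows / effective_batch)
print(steps)            # 2256, the stated public run total steps
```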
+ Best public checkpoint recorded in `trainer_state.json`:
+
+ - best checkpoint: `checkpoint-2250`
+ - best metric: `0.18876151740550995`
+
+ ## Prompt Style
+
+ This adapter was trained on a simple, explicit prompt layout:
+
+ ```text
+ System:
+ <system prompt>
+
+ User:
+ <user prompt>
+
+ Assistant:
+ ```
+
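A minimal helper that reproduces this layout, so callers don't have to rebuild the string inline each time:

```python
def build_prompt(system: str, user: str) -> str:
    """Format one turn in the System / User / Assistant layout shown above."""
    return f"System:\n{system}\n\nUser:\n{user}\n\nAssistant:\n"

prompt = build_prompt(
    "Return the smallest useful answer.",
    "List the first file to inspect.",
)
print(prompt)
```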
+ For best results, keep prompts concise, grounded, and task-shaped. The adapter responds best to:
+
+ - repo repair tasks with concrete evidence
+ - exact wording / logic cleanup tasks
+ - tool-call selection with explicit schemas
+ - technical writing with clearly requested sections
+ - review / integration prompts that specify files, symptoms, and expected outcomes
+
+ ## Quick Start

  ```python
+ import torch
+ from peft import PeftModel
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ base_id = "Qwen/Qwen3.5-4B-Base"
+ adapter_id = "Starred09/obsidian-critic-qwen35-4b-base-lora"

+ tokenizer = AutoTokenizer.from_pretrained(adapter_id)
+ base_model = AutoModelForCausalLM.from_pretrained(
+     base_id,
+     torch_dtype="auto",
+     device_map="auto",
+ )
+ model = PeftModel.from_pretrained(base_model, adapter_id)
+
+ # Use the "System / User / Assistant" layout the adapter was trained on.
+ system = "Return the smallest useful answer. Do not invent missing evidence."
+ user = "Repo: apps/desktop-shell. Build fails with ENOENT on dist/server.js. Point to the first file to inspect."
+ prompt = f"System:\n{system}\n\nUser:\n{user}\n\nAssistant:\n"
+
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ with torch.no_grad():
+     out = model.generate(**inputs, max_new_tokens=160)
+
+ print(tokenizer.decode(out[0], skip_special_tokens=True))
  ```
+ ## Intended Use
+
+ Graphite 1.0 4B is intended for:
+
+ - local and server-side coding assistants
+ - repo triage and patch-planning copilots
+ - Markdown / docs tooling assistants
+ - logic and wording critique
+ - bilingual technical task routing
+
+ It is especially useful when you want **short, grounded, non-theatrical outputs** instead of generic assistant prose.
+
+ ## Limitations
+
+ - This is an **adapter**, not a standalone merged model.
+ - It is tuned for **structured technical work**, not general consumer chat.
+ - It inherits both strengths and weaknesses from `Qwen/Qwen3.5-4B-Base`.
+ - The broad mix is intentionally heavy on repair, tool-use, and reasoning, so purely creative writing is not a target.
+ - This card documents the **first public stream only**; later `Graphite 1.1` experiments are intentionally excluded.
+
+ ## License
+
+ This repository is released under the **Apache License 2.0**. See [`LICENSE`](./LICENSE).
+
+ Please also review the license and usage terms of the base model:
+
+ - [`Qwen/Qwen3.5-4B-Base`](https://huggingface.co/Qwen/Qwen3.5-4B-Base)
+
+ ## Acknowledgements
+
+ - Alibaba Qwen team for the base model
+ - Unsloth for the efficient LoRA training stack
+ - TRL / Transformers / PEFT / PyTorch maintainers
+ - Kaggle, for the dual-T4 environment used for the public training run