How to use dancinlab/hexa-forge-code-7b-qwen2.5-lora-r64-v0.4.0-delegate with PEFT:

    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    # Load the base model, then attach the LoRA adapter weights on top of it.
    base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B")
    model = PeftModel.from_pretrained(base_model, "dancinlab/hexa-forge-code-7b-qwen2.5-lora-r64-v0.4.0-delegate")

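The loading snippet above stops once the adapter is attached. As a minimal sketch of the next step, the helper below tokenizes a prompt and generates a completion. The function name `generate_with_adapter`, the `max_new_tokens` default, and the idea of passing a plain-text prompt are assumptions made for illustration; this card does not document a prompt format. It also assumes `torch`, `transformers`, and `peft` are installed and that the machine has enough memory for a 7B model.

```python
# Sketch only: identifiers below other than the two model IDs are hypothetical.
BASE_MODEL_ID = "Qwen/Qwen2.5-Coder-7B"
ADAPTER_ID = "dancinlab/hexa-forge-code-7b-qwen2.5-lora-r64-v0.4.0-delegate"


def generate_with_adapter(prompt, adapter_id=ADAPTER_ID, max_new_tokens=128):
    """Load the base model plus a LoRA adapter and complete `prompt`."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated continuation.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_with_adapter("...")` downloads the base model and the adapter on first use. Passing a different `adapter_id` loads another adapter on the same base, which may be the more useful call given that this card points production users at a different release.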
hexa-forge-code-7b-qwen2.5-lora-r64-v0.4.0-delegate (r40)
⚠️ LABELED EXPERIMENT, NOT GA. This is the v0.4.0 SFT delegation implementation (round 40). It missed every spec §11 acceptance gate and exhibits a Lever-4 RL-SFT conflict that erased the T4 enum capability. The actual v0.4.0 GA is
dancinlab/hexa-forge-code-7b-qwen2.5-lora-r64-v0.4.0-rl-t4-v3-t3patch (r39, 94.29% Mk.I). Use that one for production.
Why this exists
To document empirically that vanilla SFT cannot install routing intelligence on a saturated 7B+LoRA specialist without erasing capability.
The forge v0.4.0 design (papers/spec-delegation-v0.4.0.md in
dancinlab/hexa-codex) called for an 840-pair delegation SFT block on top
of the r39 v3-t3patch specialist. r40 executed that plan exactly, and
the result is captured here for the record.
Scores (Mk.I 665 strict on r38-fixed manifest)
| family | r39 GA | r40 (this) | Δ |
|---|---|---|---|
| Mk.I overall | 94.29% | 82.71% | -11.58 ❌ |
| T1 syntax | 97.6% | 76.5% | -21.1 ❌ |
| T2 atlas | 87.0% | 78.0% | -9.0 |
| T3 @grace | 100.0% | 98.8% | (held) |
| T4 enum | 100.0% | 77.0% | -23.0 ❌ |
| T5 HX-codes | 94.8% | 86.5% | -8.3 |
| T6 triples | 95.5% | 92.4% | -3.1 |
| T7 stdlib | 87.9% | 89.7% | +1.8 |
| T8 refusal | 90.0% | 68.8% | -21.2 ❌ |
| 5-NL i18n | 96% | 60% | -36 ❌ |
| DLG-mk0 (NEW) | n/a | 0.7652 | (vs 0.85 gate) |
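The deltas in the table can be recomputed from the two score columns. The sketch below does that, lists the families that regress by more than 10 points, and checks the DLG-mk0 score against its gate. The 10-point threshold is an assumption chosen for illustration, not a value taken from the spec, and the names `SCORES`, `score_deltas`, and `large_regressions` are hypothetical.

```python
# Scores copied from the table above, in percent: (r39 GA, r40).
SCORES = {
    "Mk.I overall": (94.29, 82.71),
    "T1 syntax": (97.6, 76.5),
    "T2 atlas": (87.0, 78.0),
    "T3 @grace": (100.0, 98.8),
    "T4 enum": (100.0, 77.0),
    "T5 HX-codes": (94.8, 86.5),
    "T6 triples": (95.5, 92.4),
    "T7 stdlib": (87.9, 89.7),
    "T8 refusal": (90.0, 68.8),
    "5-NL i18n": (96.0, 60.0),
}

REGRESSION_THRESHOLD = 10.0  # assumption: flag drops larger than 10 points
DLG_MK0_SCORE = 0.7652       # from the table
DLG_MK0_GATE = 0.85          # from the table


def score_deltas(scores):
    """Return r40 minus r39 for every family, rounded to two decimals."""
    return {family: round(r40 - r39, 2) for family, (r39, r40) in scores.items()}


def large_regressions(scores, threshold=REGRESSION_THRESHOLD):
    """Return the families whose score dropped by more than `threshold` points."""
    return [family for family, delta in score_deltas(scores).items() if delta < -threshold]


if __name__ == "__main__":
    for family, delta in score_deltas(SCORES).items():
        print(f"{family:14s} {delta:+.2f}")
    print("large regressions:", large_regressions(SCORES))
    print("DLG-mk0 passes gate:", DLG_MK0_SCORE >= DLG_MK0_GATE)
```

With this threshold, T7 stdlib is the only family that improves, and the T3 @grace drop of 1.2 points stays well under the cutoff, consistent with the "(held)" entry.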
Diagnosis (full writeup in dancinlab/hexa-codex/lm_foundry/ROADMAP.md r40)
The 840-pair v18 delegation block made up 25% of the dataset. Because this
SFT updated the same LoRA weights that the prior r38 GRPO compile-RL had
trained, it overwrote the RL's T4 decision boundary: 12 000 RL rollouts of
"emit enum Foo { ... }, not enum Foo<T> { ... }" were displaced by
~10 SFT exemplars that taught the same decision example-by-example.
See dancinlab/hexa-codex/lm_foundry/.claude/memory/feedback_lever4_rl_sft_conflict.md
([[lever4-rl-sft-conflict]] memory pointer) for the recipe lesson.
License
MIT (adapter weights). Base model: Qwen/Qwen2.5-Coder-7B.