# Qwen3-8B Orchestrator LoRA – Successful Only

A LoRA fine-tune of Qwen/Qwen3-8B for agentic orchestration tasks (tool use, multi-turn reasoning, web search), trained exclusively on successful traces.
## Training Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3-8B |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| Learning Rate | 1.72e-04 |
| Epochs | 4 |
| Validation Loss | 0.1947 |
| Training Samples | 35,976 |
| Dataset | GLM-4.7-flash SFT traces (successful traces only) |
| Context Length | 16,384 tokens |
| Quantization | 4-bit (QLoRA) |
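The hyperparameters in the table above can be expressed as a PEFT/bitsandbytes training configuration. This is a minimal sketch, assuming standard QLoRA settings; the target modules, dropout, and compute dtype are not stated in the card and are assumptions:

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit quantization for QLoRA (NF4 and bfloat16 compute are assumed defaults)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA settings matching the table: rank 64, alpha 128
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed; not listed in the card
    lora_dropout=0.05,  # assumed
    task_type="CAUSAL_LM",
)
```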
## Evaluation Results
| Benchmark | Score |
|---|---|
| SimpleQA (200) | 11.0% accuracy |
| GAIA (165) | 9.1% accuracy |
| HLE (200) | 6.0% accuracy |
| DeepResearch (100) | 0.2206 score |
## Usage
```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Loads the base model and applies the LoRA adapter in one step
model = AutoPeftModelForCausalLM.from_pretrained(
    "akenginorhun/qwen3-8b-orchestrator-lora-successful-only",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "akenginorhun/qwen3-8b-orchestrator-lora-successful-only"
)
```
## W&B Sweep

- Sweep ID: `4j1os87c`
- Project: `hazy-research/limit-successful`
- Method: Bayesian optimization (12 runs)
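A sweep of this shape could be defined with a configuration like the one below and registered via `wandb.sweep(...)`. This is a hedged sketch: only the method, entity, and project come from the card; the metric name and search space are assumptions for illustration.

```python
# Hypothetical W&B sweep configuration for Bayesian optimization over
# LoRA hyperparameters. Metric name and parameter ranges are assumed,
# not taken from the card.
sweep_config = {
    "method": "bayes",  # Bayesian optimization, as stated in the card
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {
            "distribution": "log_uniform_values",
            "min": 1e-5,
            "max": 5e-4,
        },
        "lora_rank": {"values": [16, 32, 64]},
        "epochs": {"values": [2, 3, 4]},
    },
}
```

This dict would be passed as `wandb.sweep(sweep_config, entity="hazy-research", project="limit-successful")`, after which agents launched with `wandb.agent(sweep_id, ...)` run the 12 trials.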