PEFT
Safetensors
lora
orchestrator
qwen3
tool-use
agent

Qwen3-8B Orchestrator LoRA โ€” Successful Only

LoRA fine-tuned Qwen/Qwen3-8B for agentic orchestration tasks (tool use, multi-turn reasoning, web search). Trained exclusively on successful traces.

Training Details

Parameter Value
Base Model Qwen/Qwen3-8B
LoRA Rank 64
LoRA Alpha 128
Learning Rate 1.72e-04
Epochs 4
Validation Loss 0.1947
Training Samples 35,976
Dataset GLM-4.7-flash SFT traces (successful traces only)
Context Length 16,384 tokens
Quantization 4-bit (QLoRA)

Evaluation Results

Benchmark Score
SimpleQA (200) 11.0% accuracy
GAIA (165) 9.1% accuracy
HLE (200) 6.0% accuracy
DeepResearch (100) 0.2206 score

Usage

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained(
    "akenginorhun/qwen3-8b-orchestrator-lora-successful-only",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "akenginorhun/qwen3-8b-orchestrator-lora-successful-only"
)

W&B Sweep

Downloads last month
-
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for akenginorhun/qwen3-8b-orchestrator-lora-successful-only

Finetuned
Qwen/Qwen3-8B
Adapter
(1071)
this model