Qwen3.5 9B - Opus Agent
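This is a fine-tune of Qwen/Qwen3.5-9B on Opus traces and a small dataset. Reasoning was left untouched: the training script below sets train_on_reasoning=False, so reasoning tokens are excluded from the training loss.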
Total train time: 4 hours
Benchmarks
General benchmarks
Benchmarks provided by @nightmedia. As always, thanks for taking the time :)
| Model | arc | arc/e | boolq |
|---|---|---|---|
| armand0e/Qwen3.5-9B-Opus-Agent | 0.589 | 0.747 | 0.901 |
| Jackrong/Qwopus3.5-9B-Coder | 0.561 | 0.721 | 0.89 |
| Qwen3.5-9B | 0.571 | 0.719 | 0.895 |
Targeted benchmarks
Conducted via BenchLocal. All benchmarks are 2-shot (1 retry on failure) for ease of comparison with the numbers reported for Jackrong's Qwopus3.5 Coder.
Benchmarks for the other models were run at Q8_0; only this model's benchmarks were run at Q4_K_M.
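To make the "1 retry on failure" setting concrete, here is a minimal sketch that reads it as "each test gets at most two attempts and counts as passed if either attempt passes". This is an illustration only, not BenchLocal's code: `passes_with_retry`, `score`, and `make_flaky` are hypothetical names, and the test callables stand in for real model calls.

```python
from typing import Callable, Iterable


def passes_with_retry(attempt: Callable[[], bool], max_attempts: int = 2) -> bool:
    """Return True as soon as one attempt passes, trying at most max_attempts times."""
    for _ in range(max_attempts):
        if attempt():
            return True
    return False


def score(tests: Iterable[Callable[[], bool]], max_attempts: int = 2) -> float:
    """Percentage of tests that pass within the attempt budget."""
    results = [passes_with_retry(test, max_attempts) for test in tests]
    return 100.0 * sum(results) / len(results)


def make_flaky() -> Callable[[], bool]:
    """Hypothetical test that fails on its first call and passes on the second."""
    calls = {"n": 0}

    def attempt() -> bool:
        calls["n"] += 1
        return calls["n"] >= 2

    return attempt


# One flaky test (rescued by the retry), one that always fails, one that always passes.
demo_score = score([make_flaky(), lambda: False, lambda: True])
print(f"{demo_score:.1f}")
```

Under this reading, a model that fails intermittently scores higher than it would with a single attempt, which is worth keeping in mind when comparing against single-attempt numbers from elsewhere.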
1. Instruction Following - InstructFollow-15
InstructFollow-15 evaluates formatting, count, numbering, sentence, and length constraints.
Instruction Following - InstructFollow-15 Metrics

| Model | Test Set | Comprehensive Score | Dimension Scores (A/B/C/D/E) |
|---|---|---|---|
| armand0e/Qwen3.5-9B-Opus-Agent | InstructFollow-15 | 97 | 100 / 100 / 100 / 85 / 100 |
| Jackrong/Qwopus3.5-9B-coder | InstructFollow-15 | 93 | 100 / 100 / 100 / 67 / 100 |
2. Code Debugging & Bug Fixing - BugFind-15
BugFind-15 evaluates real debugging capability across syntax bugs, logic errors, and trap code.
Code Debugging & Bug Fixing - BugFind-15 Metrics

| Model | Test Set | Comprehensive Score | Dimension Scores (A/B/C/D/E) |
|---|---|---|---|
| armand0e/Qwen3.5-9B-Opus-Agent | BugFind-15 | 84 | 67 / 100 / 87 / 67 / 90 |
| Jackrong/Qwopus3.5-9B-coder | BugFind-15 | 79 | 67 / 87 / 100 / 77 / 43 |
| Jackrong/MLX-Qwen3.5-9B-DeepSeek-V4-Flash | BugFind-15 | 75 | 67 / 100 / 67 / 57 / 80 |
| armand0e/Qwen3.5-9B-Agent | BugFind-15 | 58 | 29 / 87 / 73 / 20 / 67 |
3. Tool Call Stability - ToolCall-15
ToolCall-15 targets stability and precision in direct tool-calling behavior.
Tool Call Stability - ToolCall-15 Metrics

| Model | Test Set | Comprehensive Score | Dimension Scores (A/B/C/D/E) |
|---|---|---|---|
| armand0e/Qwen3.5-9B-Opus-Agent | ToolCall-15 | 100 | 100 / 100 / 100 / 100 / 100 |
| Jackrong/Qwopus3.5-9B-coder | ToolCall-15 | 100 | 100 / 100 / 100 / 100 / 100 |
| Qwen/Qwen3.5-9B | ToolCall-15 | 100 | 100 / 100 / 100 / 100 / 100 |
| armand0e/Qwen3.5-9B-Agent | ToolCall-15 | 93 | 100 / 100 / 100 / 67 / 100 |
4. Complex Agent Performance - HermesAgent-20
HermesAgent-20 evaluates complex agent behavior across memory, orchestration, skill use, scheduling, and delegation.
Complex Agent Performance - HermesAgent-20 Metrics

| Model | Test Set | Comprehensive Score | Core Dimensions (Memory / Orchestration / Skills / Scheduling / Boundaries) |
|---|---|---|---|
| Jackrong/Qwopus3.5-9B-coder | HermesAgent-20 | 85 | 84 / 93 / 88 / 75 / 84 |
| armand0e/Qwen3.5-9B-Opus-Agent | HermesAgent-20 | 80 | 100 / 93 / 80 / 75 / 50 |
| Qwen/Qwen3.5-9B | HermesAgent-20 | 71 | 75 / 58 / 100 / 53 / 69 |
| armand0e/Qwen3.5-9B-Agent | HermesAgent-20 | 68 | 71 / 83 / 43 / 61 / 80 |
| DJLougen/Harmonic-Hermes-9B | HermesAgent-20 | 47 | 60 / 45 / 23 / 69 / 38 |
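
For anyone wondering how the comprehensive score relates to the five dimension scores: in the HermesAgent-20 table above, every comprehensive score matches the rounded unweighted mean of its five dimensions. That is an observation about these numbers, not a documented BenchLocal formula, and it does not hold for every table (the BugFind-15 rows differ). A quick check, with the rows copied from the table:

```python
# Rows copied from the HermesAgent-20 table: (comprehensive score, five dimension scores).
hermes_rows = {
    "Jackrong/Qwopus3.5-9B-coder": (85, [84, 93, 88, 75, 84]),
    "armand0e/Qwen3.5-9B-Opus-Agent": (80, [100, 93, 80, 75, 50]),
    "Qwen/Qwen3.5-9B": (71, [75, 58, 100, 53, 69]),
    "armand0e/Qwen3.5-9B-Agent": (68, [71, 83, 43, 61, 80]),
    "DJLougen/Harmonic-Hermes-9B": (47, [60, 45, 23, 69, 38]),
}


def rounded_mean(values):
    """Unweighted mean of the dimension scores, rounded to the nearest integer."""
    return round(sum(values) / len(values))


# Collect any model whose reported score differs from the rounded mean.
mismatches = [
    name
    for name, (reported, dims) in hermes_rows.items()
    if rounded_mean(dims) != reported
]
print(mismatches)
```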
Screenshots (images not reproduced here): ToolCall-15, HermesAgent-20, BugFind-15, InstructFollow-15.
Ignore the gemma-4 llama.cpp alias I had set; it was old and I forgot to change it.
Training Script
# -*- coding: utf-8 -*-
import os
from unsloth import FastModel
import torch
from trl import SFTConfig, SFTTrainer
from teich import mask_data, prepare_data
MAX_SEQ_LEN = 32768
MODEL_NAME = os.environ.get("MODEL_NAME", "Qwen/Qwen3.5-9B")
OUTPUT_DIR = os.environ.get("OUTPUT_DIR", "outputs/qwen-tool-sft")
HUB_REPO_ID = os.environ.get("HUB_REPO_ID", "armand0e/Qwen3.5-9B-Opus-Agent")
HF_TOKEN = os.environ.get("HF_TOKEN", "")
model, tokenizer = FastModel.from_pretrained(
    model_name=MODEL_NAME,
    max_seq_length=MAX_SEQ_LEN,
    load_in_4bit=False,
    load_in_8bit=False,
    full_finetuning=False,
)
model = FastModel.get_peft_model(
    model,
    finetune_vision_layers=False,  # Turn off for just text!
    finetune_language_layers=True,  # Should leave on!
    finetune_attention_modules=True,  # Attention good for GRPO
    finetune_mlp_modules=True,  # Should leave on always!
    r=32,  # Larger = higher accuracy, but might overfit
    lora_alpha=64,  # Recommended alpha == r at least
    lora_dropout=0,
    bias="none",
    random_state=3407,
)
train_dataset = prepare_data(
    {
        "chat": {
            "source": "TeichAI/claude-4.5-opus-high-reasoning-250x"
        },
        "opus-agent": {
            "source": "armand0e/badlogicgames-pi-mono-opus-filtered",
        },
    },
    tokenizer,
    split="train",
    hf_token=HF_TOKEN,
    chat_template_kwargs={"enable_thinking": True},
    max_length=MAX_SEQ_LEN,
    drop_oversized_examples=True,
    trim_oversized_followups=True,
    tokenize=True,
    strict=True,
)
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    eval_dataset=None,
    args=SFTConfig(
        dataset_text_field="text",
        dataset_num_proc=1,
        max_length=MAX_SEQ_LEN,
        packing=False,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        warmup_steps=5,
        num_train_epochs=2,
        learning_rate=2e-5,
        logging_steps=1,
        save_steps=100,
        save_total_limit=3,
        optim="adamw_8bit",
        weight_decay=0.01,
        max_grad_norm=0.3,
        lr_scheduler_type="linear",
        output_dir=OUTPUT_DIR,
        seed=3407,
        report_to="none",
    ),
)
trainer = mask_data(
    trainer,
    tokenizer=tokenizer,
    train_on_reasoning=False,
    train_on_final_answers=True,
    train_on_tools=True,
)
print(trainer.train_dataset.preview())
trainer_stats = trainer.train(resume_from_checkpoint=False)
model.push_to_hub(f"{HUB_REPO_ID}-LoRA", token=HF_TOKEN)
tokenizer.push_to_hub(f"{HUB_REPO_ID}-LoRA", token=HF_TOKEN)
model.push_to_hub_merged(HUB_REPO_ID, tokenizer, save_method="merged_16bit", token=HF_TOKEN)
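
A few numbers follow directly from the settings in the script above. The sketch below works them out, assuming standard LoRA scaling (lora_alpha / r, i.e. not rank-stabilized LoRA, which the script does not enable) and a single device. The 4096 x 4096 layer shape is a hypothetical example for illustration, not the actual shape of any Qwen3.5-9B layer.

```python
# Values taken from the training script.
r = 32
lora_alpha = 64
per_device_train_batch_size = 1
gradient_accumulation_steps = 8

# Standard LoRA multiplies the low-rank update B @ A by lora_alpha / r.
scaling = lora_alpha / r


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted linear layer: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical square projection, chosen only to show the order of magnitude.
d_in = d_out = 4096
full_params = d_in * d_out
adapter_params = lora_param_count(d_in, d_out, r)
fraction = adapter_params / full_params

# Sequences contributing to each optimizer step on one device.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps

print(scaling, adapter_params, fraction, effective_batch)
```

So with these settings the adapter update is scaled by 2, each optimizer step sees 8 sequences per device, and for a layer of the assumed shape the adapter trains roughly 1.6% as many parameters as the full weight matrix.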
The data for this model was easily formatted and masked with Teich.
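
For readers unfamiliar with loss masking, here is a minimal sketch of the general idea, not Teich's actual implementation. Tokens that should not be trained on get the label -100, which is the default `ignore_index` of PyTorch's cross-entropy loss, so they contribute nothing to the gradient. The segment names and the `build_labels` helper are hypothetical; the `train_on` set mirrors the script's settings (reasoning off, final answers and tool calls on).

```python
IGNORE_INDEX = -100  # default ignore_index for torch.nn.CrossEntropyLoss


def build_labels(token_ids, segments, train_on):
    """Copy a token id into the labels only if its segment is trained on; mask the rest."""
    return [
        tok if seg in train_on else IGNORE_INDEX
        for tok, seg in zip(token_ids, segments)
    ]


# Toy sequence: each token id is tagged with the (hypothetical) segment it belongs to.
token_ids = [11, 12, 21, 22, 31, 32, 41]
segments = [
    "prompt", "prompt",
    "reasoning", "reasoning",
    "tool_call", "tool_call",
    "answer",
]

# Mirrors train_on_reasoning=False, train_on_final_answers=True, train_on_tools=True.
labels = build_labels(token_ids, segments, train_on={"tool_call", "answer"})
print(labels)
```

The prompt and reasoning positions are masked while the tool-call and answer positions keep their token ids, which is what leaves the base model's reasoning behavior out of the training signal.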
- Developed by: armand0e
- License: apache-2.0
- Finetuned from model: Qwen/Qwen3.5-9B
This qwen3_5 model was trained 2x faster with Unsloth and Hugging Face's TRL library.





