🧠 Qwen 3.5 35B-A3B: Cagatay LoRA

A LoRA fine-tune of Qwen/Qwen3.5-35B-A3B, a 35B-parameter Mixture-of-Experts model with only 3B active parameters per token.


🎯 What is this?

A LoRA adapter for the Qwen 3.5 35B MoE (Mixture-of-Experts) model. This architecture gives you 35B-level reasoning with only 3B active parameters per forward pass, making it surprisingly efficient for complex robotics task planning.

Fine-tuned using SFT via TRL on HuggingFace Jobs.

⚡ Why MoE for Robotics?

| Property | Benefit |
|---|---|
| 35B total params | Deep reasoning capacity for complex multi-step tasks |
| 3B active params | Fast inference; only 3B params compute per token |
| Expert routing | Different experts specialize in different command types |
| Efficient LoRA | Only 50 MB adapter on top of the base model |
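Expert routing is worth making concrete. Below is a minimal, pure-Python sketch of top-k gating, the core idea behind MoE layers: a router scores every expert, but only the top k actually run per token, so compute stays small even though total parameters are large. This is illustrative only (toy scalar "experts", not Qwen's actual implementation).

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(token, experts, router_scores, top_k=2):
    """Route a token to its top_k experts and mix their outputs.

    Only top_k experts execute, so per-token compute is a small
    fraction of the total parameter count.
    """
    # Rank experts by router score and keep the top_k.
    ranked = sorted(range(len(experts)), key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:top_k]
    # Renormalize gate weights over the chosen experts only.
    gates = softmax([router_scores[i] for i in chosen])
    # Weighted sum of the selected experts' outputs.
    return sum(g * experts[i](token) for g, i in zip(gates, chosen))

# 8 toy "experts" (each just scales its input); only 2 run per token.
experts = [lambda x, k=k: x * (k + 1) for k in range(8)]
scores = [0.1, 2.0, 0.3, 1.5, 0.0, 0.2, 0.4, 0.1]
out = moe_forward(3.0, experts, scores, top_k=2)
```

With these scores, experts 1 and 3 are selected and the other six never execute; in a real MoE layer each "expert" is a full feed-forward block and the router is a learned linear layer.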

📊 Training Details

| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3.5-35B-A3B (MoE) |
| Architecture | Mixture-of-Experts (35B total, 3B active) |
| Method | LoRA (PEFT) + SFT (TRL) |
| Rank (r) | 32 |
| Alpha | 64 |
| Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Adapter Size | 50 MB |
| Framework | TRL 0.29.1, Transformers 5.3.0, PyTorch 2.10.0, PEFT 0.18.1 |
| Training | HuggingFace Jobs (cloud GPU) |
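The hyperparameters above map directly onto a PEFT `LoraConfig` plus a TRL `SFTTrainer` run. A minimal configuration sketch follows; the output directory and `dataset` are placeholders, not the actual training job.

```python
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# LoRA hyperparameters from the table above.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="qwen3.5-35B-A3B-cagatay",  # placeholder path
    bf16=True,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3.5-35B-A3B",
    args=training_args,
    train_dataset=dataset,   # placeholder: your SFT dataset
    peft_config=lora_config,
)
trainer.train()
```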

🚀 Quick Start

```python
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="cagataydev/qwen3.5-35B-A3B-cagatay",
    device_map="auto",
    torch_dtype="auto",
)

# Complex multi-step robotics reasoning
output = generator(
    [{"role": "user", "content": "You're a household robot. The kitchen is messy after cooking. Plan a complete cleanup sequence, considering what needs to be done first and why."}],
    max_new_tokens=512,
    return_full_text=False,
)[0]
print(output["generated_text"])
```

With PEFT (explicit)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-35B-A3B",
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "cagataydev/qwen3.5-35B-A3B-cagatay")
tokenizer = AutoTokenizer.from_pretrained("cagataydev/qwen3.5-35B-A3B-cagatay")
```
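When calling `model.generate` directly, render the conversation with the tokenizer's chat template first (`tokenizer.apply_chat_template(messages, add_generation_prompt=True)`). Qwen models use the ChatML template family; the sketch below is a pure-Python illustration of roughly what that call produces. It is for orientation only; in practice, always use the real tokenizer method.

```python
def chatml_prompt(messages):
    """Render messages in ChatML style (the template family Qwen uses),
    ending with an open assistant turn so the model generates a reply."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
             for m in messages]
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = chatml_prompt([
    {"role": "user", "content": "Plan a kitchen cleanup sequence."}
])
```

Tokenize the rendered prompt and pass it to `model.generate`, or skip the manual step entirely with `tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")`.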

🤖 Use Cases

  • Complex task planning: multi-step reasoning with dependency awareness
  • Household robotics: full cleanup/cooking/organizing sequences
  • Safety-aware planning: considers order of operations and risks
  • Neon VLA reasoning engine: the highest-capability model in the Neon stack

📦 Model Family

| Model | Base | Total / Active | Best For |
|---|---|---|---|
| qwen2.5-omni-3b | Qwen 2.5 3B | 1.8B / 1.8B | Voice commands |
| qwen3.5-4B | Qwen 3.5 4B | 4B / 4B | Simple task planning |
| qwen3.5-35B-A3B | Qwen 3.5 35B MoE | 35B / 3B | Complex reasoning (this model) |

💡 Hardware Requirements

| Setup | Works? | Notes |
|---|---|---|
| A100 80GB | ✅ | Full precision |
| L40S 46GB | ✅ | bf16 |
| RTX 4090 24GB | ✅ | 4-bit quantization (GPTQ/AWQ) |
| Jetson Orin 32GB | ⚠️ | Needs quantization |
| Consumer 16GB | ❌ | Too large even when quantized |
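For the 24 GB tier, the table suggests GPTQ/AWQ checkpoints; an alternative is on-the-fly 4-bit loading via bitsandbytes. The sketch below shows that route under stated assumptions (a CUDA GPU with the `bitsandbytes` package installed); it is not an official recipe for this model, and pre-quantized GPTQ/AWQ weights will generally load faster.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # NormalFloat4 quantization
    bnb_4bit_compute_dtype="bfloat16",  # run matmuls in bf16
    bnb_4bit_use_double_quant=True,     # also quantize the quantization constants
)

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-35B-A3B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "cagataydev/qwen3.5-35B-A3B-cagatay")
```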

Built with DevDuck 🦆 | Trained on HuggingFace Jobs | Part of the Neon VLA ecosystem
