Qwen3.5-0.8B Text-Only
Text-only weights extracted from Qwen/Qwen3.5-0.8B (VLM) for use with vLLM's Qwen3_5ForCausalLM architecture.
What this is
Qwen3.5 models are natively multimodal (VLM). Their HuggingFace checkpoints use Qwen3_5ForConditionalGeneration with weights prefixed as model.language_model.*. This repo provides the language model backbone only, with:
- architectures: ["Qwen3_5ForCausalLM"]
- model_type: "qwen3_5_text"
- Weight keys at model.layers.* (standard causal LM format, no language_model. prefix)
- Vision encoder and MTP weights removed
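The key renaming described above can be sketched as a plain dictionary transform. This is a minimal illustration, not the script used to build this repo: the function name to_text_only_keys is hypothetical, and the prefixes used to recognize vision encoder and MTP weights ("model.visual." and "mtp.") are assumptions that should be checked against the actual checkpoint keys.

```python
from typing import Dict, Iterable, TypeVar

T = TypeVar("T")

# Prefix stated in this README for the language model weights in the VLM checkpoint.
LANGUAGE_PREFIX = "model.language_model."

# ASSUMPTION: prefixes of the vision encoder and MTP weights.
# Verify against the real checkpoint before relying on them.
ASSUMED_DROP_PREFIXES = ("model.visual.", "mtp.")


def to_text_only_keys(
    state_dict: Dict[str, T],
    drop_prefixes: Iterable[str] = ASSUMED_DROP_PREFIXES,
) -> Dict[str, T]:
    """Return a new mapping with dropped components removed and
    model.language_model.* keys renamed to model.*."""
    drop = tuple(drop_prefixes)
    out: Dict[str, T] = {}
    for key, value in state_dict.items():
        # Skip weights that belong to the removed components.
        if key.startswith(drop):
            continue
        # Strip the language_model. segment so keys read model.layers.* etc.
        if key.startswith(LANGUAGE_PREFIX):
            key = "model." + key[len(LANGUAGE_PREFIX):]
        out[key] = value
    return out
```

The function works on any mapping from key names to values, so it can be tried on key names alone before touching real tensors.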
Model structure
- Architecture: Hybrid GatedDeltaNet (24 layers) + Full Attention (8 layers)
- Parameters: ~0.8B (language model only, no vision encoder)
- Dtype: bfloat16
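As a rough back-of-the-envelope check, the parameter count and dtype above give an approximate weight footprint. The sketch below assumes exactly 0.8e9 parameters (the README only says ~0.8B) and counts weights only; KV cache, activations, and runtime overhead are not included. The helper name weight_bytes is hypothetical.

```python
# Bytes per element for a few common dtypes (bfloat16 uses 16 bits).
BYTES_PER_ELEMENT = {"bfloat16": 2, "float16": 2, "float32": 4}


def weight_bytes(num_params: float, dtype: str = "bfloat16") -> float:
    """Approximate storage for the weights alone, in bytes."""
    return num_params * BYTES_PER_ELEMENT[dtype]


# ASSUMPTION: 0.8e9 parameters, taken from the "~0.8B" figure above.
approx = weight_bytes(0.8e9, "bfloat16")
print(f"{approx / 1e9:.1f} GB ({approx / 2**30:.2f} GiB) for weights only")
```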
How to use with vLLM
from vllm import LLM
llm = LLM(model="codecho/Qwen3.5-0.8B-text-only", trust_remote_code=True)