Qwen3.5-0.8B Text-Only

Text-only weights extracted from Qwen/Qwen3.5-0.8B (VLM) for use with vLLM's Qwen3_5ForCausalLM architecture.

What this is

Qwen3.5 models are natively multimodal vision-language models (VLMs). Their HuggingFace checkpoints use the Qwen3_5ForConditionalGeneration architecture, with language model weights stored under the model.language_model.* prefix. This repo provides the language model backbone only, with:

architectures: ["Qwen3_5ForCausalLM"]
model_type: "qwen3_5_text"
Weight keys at model.layers.* (standard causal LM format, no language_model. prefix)
Vision encoder and MTP weights removed
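The key remapping described above can be sketched as a small pure-Python function. This is a minimal illustration, not the script used to build this repo: the model.language_model. prefix comes from the text above, but the prefixes used here for the vision encoder ("model.visual.") and MTP weights ("mtp.") are assumptions, as is the function name to_text_only. Check the key names in the actual checkpoint before relying on them.

```python
from typing import Dict, TypeVar

T = TypeVar("T")

# Prefix of the language model weights in the VLM checkpoint (from the text above).
LANGUAGE_PREFIX = "model.language_model."
# Target prefix, so that keys end up at model.layers.* and so on.
TARGET_PREFIX = "model."
# ASSUMPTION: hypothetical prefixes for the vision encoder and MTP weights.
DROPPED_PREFIXES = ("model.visual.", "mtp.")


def to_text_only(state_dict: Dict[str, T]) -> Dict[str, T]:
    """Return a copy of state_dict with text-only, causal-LM style keys."""
    out: Dict[str, T] = {}
    for key, tensor in state_dict.items():
        # Skip weights that do not belong to the language model backbone.
        if key.startswith(DROPPED_PREFIXES):
            continue
        # Strip the language_model. segment from the key.
        if key.startswith(LANGUAGE_PREFIX):
            key = TARGET_PREFIX + key[len(LANGUAGE_PREFIX):]
        # Keys without the prefix (for example an output head) pass through unchanged.
        out[key] = tensor
    return out
```

The function works on any mapping from key to tensor, so it could be applied to a state dict loaded from the original checkpoint before saving it again alongside a config that sets architectures and model_type as listed above.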

Model structure

Architecture: Hybrid GatedDeltaNet (24 layers) + Full Attention (8 layers)
Parameters: ~0.8B (language model only, no vision encoder)
Dtype: bfloat16
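To make the 24 + 8 split concrete, here is a hedged sketch of one way the two layer types could be interleaved. The layer counts come from the line above; the interleaving pattern (every fourth layer uses full attention) and the labels "linear_attention" and "full_attention" are assumptions for illustration, so consult the repo's config.json for the actual layout.

```python
from typing import List

NUM_LAYERS = 32              # 24 GatedDeltaNet + 8 full attention (from the model card)
FULL_ATTENTION_INTERVAL = 4  # ASSUMPTION: every 4th layer is full attention


def layer_types(num_layers: int = NUM_LAYERS,
                interval: int = FULL_ATTENTION_INTERVAL) -> List[str]:
    """Build a hypothetical per-layer type list for the hybrid stack."""
    return [
        "full_attention" if (i + 1) % interval == 0 else "linear_attention"
        for i in range(num_layers)
    ]


types = layer_types()
print(types.count("linear_attention"), types.count("full_attention"))
```

Under this assumed pattern the counts match the stated structure: 24 GatedDeltaNet (linear attention) layers and 8 full attention layers.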

How to use with vLLM

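from vllm import LLM

# Load the text-only checkpoint, then generate a completion for a single prompt.
llm = LLM(model="codecho/Qwen3.5-0.8B-text-only", trust_remote_code=True)
outputs = llm.generate(["Hello, my name is"])
print(outputs[0].outputs[0].text)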

