Qwen3.5-0.8B Text-Only

Text-only weights extracted from Qwen/Qwen3.5-0.8B (VLM) for use with vLLM's Qwen3_5ForCausalLM architecture.

What this is

Qwen3.5 models are natively multimodal vision-language models (VLMs). Their Hugging Face checkpoints use the Qwen3_5ForConditionalGeneration architecture and store the language model weights under the model.language_model.* prefix. This repo provides only the language model backbone, with:

  • architectures: ["Qwen3_5ForCausalLM"]
  • model_type: "qwen3_5_text"
  • Weight keys at model.layers.* (standard causal LM format, no language_model. prefix)
  • Vision encoder and MTP (multi-token prediction) weights removed
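
The key changes listed above can be sketched as a small renaming pass over checkpoint keys. This is a minimal illustration, not the script used to build this repo: the model.language_model. prefix comes from the description above, while the model.visual. and mtp. prefixes and the helper names remap_key and remap_state_dict are assumptions made for the example. Check the key names in the original checkpoint before relying on them.

```python
# Prefix taken from the description above.
VLM_TEXT_PREFIX = "model.language_model."
TEXT_PREFIX = "model."
# Hypothetical prefixes for the vision encoder and MTP weights (assumption).
DROPPED_PREFIXES = ("model.visual.", "mtp.")


def remap_key(key):
    """Return the text-only key for a VLM checkpoint key, or None to drop it."""
    if key.startswith(DROPPED_PREFIXES):
        return None
    if key.startswith(VLM_TEXT_PREFIX):
        # model.language_model.layers.* becomes model.layers.*
        return TEXT_PREFIX + key[len(VLM_TEXT_PREFIX):]
    # Anything else (for example lm_head.*) is kept unchanged.
    return key


def remap_state_dict(state_dict):
    """Apply remap_key to every entry, skipping the dropped weights."""
    remapped = {}
    for key, tensor in state_dict.items():
        new_key = remap_key(key)
        if new_key is not None:
            remapped[new_key] = tensor
    return remapped
```

The functions only touch key names, so they work on any mapping from names to tensors, such as one loaded from a safetensors file.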

Model structure

  • Architecture: Hybrid GatedDeltaNet (24 layers) + Full Attention (8 layers)
  • Parameters: ~0.8B (language model only, no vision encoder)
  • Dtype: bfloat16
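
To make the hybrid layout concrete, the sketch below builds a per-layer type list from the counts given above (24 GatedDeltaNet layers and 8 full attention layers, 32 in total). The even spacing, with every fourth layer using full attention, and the labels "linear_attention" and "full_attention" are assumptions for illustration. The config.json in this repo is the authoritative source for the real ordering.

```python
NUM_LINEAR_LAYERS = 24  # GatedDeltaNet layers, from the list above
NUM_FULL_LAYERS = 8     # full attention layers, from the list above
NUM_LAYERS = NUM_LINEAR_LAYERS + NUM_FULL_LAYERS

# Assumption: full attention layers are evenly spaced through the stack.
FULL_ATTENTION_INTERVAL = NUM_LAYERS // NUM_FULL_LAYERS


def layer_types(num_layers=NUM_LAYERS, interval=FULL_ATTENTION_INTERVAL):
    """Label each layer, placing full attention at every interval-th position."""
    return [
        "full_attention" if (index + 1) % interval == 0 else "linear_attention"
        for index in range(num_layers)
    ]


if __name__ == "__main__":
    for index, kind in enumerate(layer_types()):
        print(index, kind)
```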

How to use with vLLM

from vllm import LLM
llm = LLM(model="codecho/Qwen3.5-0.8B-text-only", trust_remote_code=True)
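
Once the model is loaded, text can be generated with vLLM's SamplingParams and LLM.generate. The following is a minimal sketch: the prompt, the sampling values, and the helper name run_demo are illustrative choices, not recommendations from the model authors. vLLM is imported inside the function so the module can be imported on a machine without it installed.

```python
MODEL_ID = "codecho/Qwen3.5-0.8B-text-only"


def run_demo(prompts, max_tokens=64, temperature=0.7):
    """Generate one completion per prompt and return (prompt, text) pairs."""
    # Imported lazily: vLLM is a heavy dependency and typically needs a GPU.
    from vllm import LLM, SamplingParams

    llm = LLM(model=MODEL_ID, trust_remote_code=True)
    params = SamplingParams(temperature=temperature, max_tokens=max_tokens)
    results = llm.generate(prompts, params)
    # Each result carries its prompt and a list of completions; take the first.
    return [(result.prompt, result.outputs[0].text) for result in results]


if __name__ == "__main__":
    for prompt, text in run_demo(["The capital of France is"]):
        print(repr(prompt), "->", repr(text))
```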