# Qwen2.5-Omni-3B Decoder (GGUF)
Text-only decoder extracted from [Qwen/Qwen2.5-Omni-3B](https://huggingface.co/Qwen/Qwen2.5-Omni-3B).
## Architecture
- Type: Qwen2VL (text decoder)
- Parameters: 3.4B (decoder only, excluding vision/audio/talker/token2wav)
- Hidden size: 2048
- Layers: 36
- Attention heads: 16 (KV heads: 2, GQA)
- FFN size: 11008
- Vocab: 151,936
- Context: 32,768 tokens
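The stated 3.4B decoder parameter count can be sanity-checked from these hyperparameters. The sketch below assumes a standard Qwen2-style layer layout (QKV projection biases, no output-projection bias, SwiGLU FFN, two RMSNorms per layer, untied output embeddings); it is a back-of-the-envelope illustration, not the exact tensor inventory of the GGUF file.

```python
# Rough parameter count for the text decoder, assuming a Qwen2-style
# layer layout (illustration only, not the file's actual tensor list).
hidden, layers, vocab, ffn = 2048, 36, 151_936, 11_008
heads, kv_heads = 16, 2
head_dim = hidden // heads           # 128
kv_dim = kv_heads * head_dim         # 256 (GQA: 2 KV heads)

embed = vocab * hidden               # token embeddings
attn = (hidden * hidden + hidden     # Q projection + bias
        + 2 * (hidden * kv_dim + kv_dim)  # K and V projections + biases
        + hidden * hidden)           # output projection (no bias)
ffn_params = 3 * hidden * ffn        # gate, up, down (SwiGLU)
norms = 2 * hidden                   # two RMSNorms per layer

per_layer = attn + ffn_params + norms
# + final norm + untied lm_head
total = embed + layers * per_layer + hidden + vocab * hidden

print(f"{total / 1e9:.2f}B parameters")  # -> 3.40B, matching the stated 3.4B
```

At FP16 (2 bytes per parameter) this also lines up with the ~6.4 GB file size below.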
## Files
| File | Size | Description |
|---|---|---|
| Qwen2.5-Omni-3B-decoder-F16.gguf | 6.4 GB | Full precision (FP16) |
## Usage with llama.cpp
```bash
llama-cli -m Qwen2.5-Omni-3B-decoder-F16.gguf -p "Hello" -n 100 -no-cnv
```
## Extraction
Extracted using `convert_hf_to_gguf.py` from llama.cpp. The converter automatically strips the `thinker.` prefix from tensor names and drops the vision, audio, talker, and token2wav components, keeping only the text decoder (435 tensors).
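The rename-and-filter step can be pictured with a small sketch. This is a simplified approximation of the converter's behavior, not its actual code, and the example tensor names are hypothetical entries in the Omni checkpoint's naming scheme:

```python
# Simplified sketch: keep only thinker text-model tensors, strip the
# "thinker." prefix, and drop everything else (vision/audio/talker/token2wav).
# Approximation for illustration; not llama.cpp's actual implementation.
DROP_PREFIXES = ("thinker.visual.", "thinker.audio_tower.",
                 "talker.", "token2wav.")

def filter_tensor_name(name: str):
    """Return the renamed tensor name, or None if it should be dropped."""
    if name.startswith(DROP_PREFIXES):
        return None                        # non-decoder component
    if name.startswith("thinker."):
        return name[len("thinker."):]      # strip the "thinker." prefix
    return None                            # not part of the text decoder

# Hypothetical example names:
names = [
    "thinker.model.embed_tokens.weight",           # kept, prefix stripped
    "thinker.visual.blocks.0.attn.qkv.weight",     # vision -> dropped
    "talker.model.layers.0.mlp.gate_proj.weight",  # talker -> dropped
    "token2wav.code2wav.weight",                   # vocoder -> dropped
]
kept = [n2 for n in names if (n2 := filter_tensor_name(n)) is not None]
print(kept)  # ['model.embed_tokens.weight']
```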