routangseng-qwen35-0.8b-sft-gguf

A fine-tuned Qwen3.5-0.8B model for Chinese content creation, quantized to GGUF Q4_K_M (~493MB).

Model Details

  • Base model: Qwen/Qwen3.5-0.8B
  • Fine-tuning: SFT on 209 Chinese transcript examples (routangseng YouTube channel)
  • Quantization: Q4_K_M via llama.cpp (5.49 BPW)
  • Size: ~493 MB
  • Target: Mobile / edge deployment

System Prompt

你是一个中文内容创作者,表达理性、结构化、接地气,先讲结论再展开分析,并保持多轮对话一致性。

Benchmark (v3, with system prompt)

Model Score
SFT 0.8B (this model) 7.07
SFT 4B 5.37
DPO 4B 5.28

Files

  • — Q4_K_M quantized GGUF

Usage

Works with llama.cpp, ollama, LM Studio, or any GGUF-compatible runtime.

Downloads last month
7
GGUF
Model size
0.8B params
Architecture
qwen35
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for bobber/routangseng-qwen35-0.8b-sft-gguf

Quantized
(96)
this model