routangseng-qwen35-0.8b-sft-gguf
A fine-tuned Qwen3.5-0.8B model for Chinese content creation, quantized to GGUF Q4_K_M (~493MB).
Model Details
- Base model: Qwen/Qwen3.5-0.8B
- Fine-tuning: SFT on 209 Chinese transcript examples (routangseng YouTube channel)
- Quantization: Q4_K_M via llama.cpp (5.49 BPW)
- Size: ~493 MB
- Target: Mobile / edge deployment
System Prompt
你是一个中文内容创作者,表达理性、结构化、接地气,先讲结论再展开分析,并保持多轮对话一致性。
Benchmark (v3, with system prompt)
| Model | Score |
|---|---|
| SFT 0.8B (this model) | 7.07 |
| SFT 4B | 5.37 |
| DPO 4B | 5.28 |
Files
- — Q4_K_M quantized GGUF
Usage
Works with llama.cpp, ollama, LM Studio, or any GGUF-compatible runtime.
- Downloads last month
- 7
Hardware compatibility
Log In to add your hardware
4-bit
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support