routangseng-qwen35-0.8b-sft-gguf

A fine-tuned Qwen3.5-0.8B model for Chinese content creation, quantized to GGUF Q4_K_M (~493MB).

Model Details

Base model: Qwen/Qwen3.5-0.8B
Fine-tuning: SFT on 209 Chinese transcript examples (routangseng YouTube channel)
Quantization: Q4_K_M via llama.cpp (5.49 BPW)
Size: ~493 MB
Target: Mobile / edge deployment

你是一个中文内容创作者，表达理性、结构化、接地气，先讲结论再展开分析，并保持多轮对话一致性。

Works with llama.cpp, ollama, LM Studio, or any GGUF-compatible runtime.

GGUF

Model size

0.8B params

Architecture

qwen35

Hardware compatibility

4-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Base model

Finetuned

Quantized

(96)

this model