philipp-zettl posted an update 18 days ago
I'm unemployed, I have a gaming GPU, and I just published a German LLM.

qwen3-0.6b-german - fine-tuned Qwen3-0.6B in ~40h on an RTX 4070 Ti, using the exact same instruct datasets as the LLäMmlein paper (ACL 2025).

HellaSwag-DE: 0.3111 → 0.3193 ✅
ARC-DE: 0.2352 → 0.2575 ✅
MMLU-DE: 0.3600 → 0.2475 🔻 (alignment tax, a known trade-off)

Instruction fine-tuning trades some factual breadth for better reasoning and instruction following. The model is more useful overall, even if it doesn't improve on every metric.

Weights, LoRA adapter, full training script, and logs are all public:

philipp-zettl/qwen3-0.6b-german

It ain't much, but it's honest work.