# Austrian German Qwen3-8B QLoRA
Qwen3-8B with a QLoRA adapter trained on Austrian German data. The goal: a model that says "Erdapfel" instead of "Kartoffel" and knows that the Meldezettel is a thing.
## Current status
Training data is ready (AT-Instruct, 500 pairs). Training is scheduled. I'll release the adapter only if it actually improves over base Qwen3 on AT-Bench — no point shipping something that doesn't help.
| Item | Status |
|---|---|
| Base model | Qwen3-8B (Apache 2.0) |
| Training data | Done (AT-Instruct v1.0) |
| QLoRA training | Scheduled |
| Evaluation | After training |
| Release | Only if it helps |
## Setup

- Base: Qwen3-8B
- Method: QLoRA (4-bit NF4 quantization, LoRA rank r=64)
- Targets: q/k/v/o/gate/up/down projections
- VRAM: ~24 GB (RTX 3090/4090 or A100)
## Training script
`scripts/train.py`: standard PEFT + TRL setup. Nothing exotic.
## What this should do better than base Qwen3
- Austrian vocabulary and idioms
- Austrian institutions (Magistrat, not Bürgeramt)
- Austrian legal terms (ABGB, not BGB)
- Text generation in an Austrian German register
## What this won't be
Not a new model. It's an adapter, clearly based on Qwen3, using openly licensed training data. I'm not claiming it beats anything at general tasks.
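Because it's an adapter, using it would mean loading base Qwen3-8B and attaching the adapter on top, roughly as below. `ADAPTER_ID` is a placeholder repo id, since nothing has been released yet:

```python
# Hypothetical usage sketch: base model + this adapter via peft.
# ADAPTER_ID is a placeholder; no adapter has been released yet.
BASE_MODEL = "Qwen/Qwen3-8B"
ADAPTER_ID = "your-username/qwen3-8b-austrian-qlora"  # placeholder

if __name__ == "__main__":
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL, torch_dtype=torch.bfloat16, device_map="auto"
    )
    # Attach the LoRA adapter on top of the frozen base weights.
    model = PeftModel.from_pretrained(base, ADAPTER_ID)

    messages = [{"role": "user",
                 "content": "Wie melde ich meinen Hauptwohnsitz in Wien an?"}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=256)
    print(tokenizer.decode(out[0][inputs.shape[-1]:],
                           skip_special_tokens=True))
```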
## License
Apache 2.0 (same as Qwen3-8B base).