# keysay-transcription-cleaner-0.8B-8bit
A fine-tuned Qwen3.5-0.8B LLM that cleans speech transcriptions into ready-to-send chat messages. It removes self-corrections and filler words while preserving the speaker's intended meaning.
## What it does
| Input (messy speech) | Output (clean message) |
|---|---|
| es a las 5, no perdón, a las 6 | es a las 6 |
| I think we should, um, you know, go with the first option, no wait, the second one is better | I think we should go with the second one is better |
| Bueno, pues, o sea, la reunión es el martes a las tres. | La reunión es el martes a las tres. |
| el precio es 100, no 200, no perdón, 150 euros | el precio es 150 euros |
| habla con María, no perdón, con Carlos sobre el proyecto | habla con Carlos sobre el proyecto |
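The eval reported under Training is exact-match: a cleaned output counts only if it equals the expected message verbatim. A minimal sketch of such a harness, using two rows from the table above (the `score` helper and the baselines are illustrative, not the released test suite):

```python
# Exact-match scoring sketch: count cases where clean_fn(raw) equals the
# expected message verbatim. In the real suite, clean_fn would call the
# fine-tuned model via mlx_lm.generate.

def score(clean_fn, cases):
    """Return the number of (raw, expected) pairs cleaned to an exact match."""
    return sum(1 for raw, expected in cases if clean_fn(raw).strip() == expected)

cases = [
    ("es a las 5, no perdón, a las 6", "es a las 6"),
    ("habla con María, no perdón, con Carlos sobre el proyecto",
     "habla con Carlos sobre el proyecto"),
]

# A passthrough baseline scores 0: the raw transcription still contains
# the discarded correction, so it never matches the expected message.
identity = lambda s: s
print(score(identity, cases))  # 0
```

Exact match is deliberately strict: it penalizes any rephrasing or added words, mirroring the "never rephrase, never add words" constraint in the system prompt.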
## Training
- Method: LoRA fine-tuning + fusion
- Base model: mlx-community/Qwen3.5-0.8B-8bit
- Teacher: Google Gemini Flash (knowledge distillation)
- Dataset: 360 examples (70% Spanish, 30% English) covering self-corrections, filler removal, nested corrections, and passthrough
- Eval score: 12/12 on hand-verified test suite (base model: 0/12)
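For reference, mlx-lm's LoRA trainer accepts chat-style JSONL, one example per line. One training pair built from the table above might look roughly like this (the field layout follows mlx-lm's chat format; the exact system prompt and examples used in the released dataset are not published, so treat this as a sketch):

```json
{"messages": [
  {"role": "system", "content": "You clean speech-to-text transcriptions into ready-to-send chat messages. Remove self-corrections (keep only the final version). Remove filler words. Never rephrase. Never add words. Keep the original language."},
  {"role": "user", "content": "es a las 5, no perdón, a las 6"},
  {"role": "assistant", "content": "es a las 6"}
]}
```

In the distillation setup described above, the assistant turns would come from the teacher (Gemini Flash) rather than being written by hand.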
## Usage
```python
from mlx_lm import load, generate

model, tokenizer = load("Enriqueag26/keysay-transcription-cleaner-0.8B-8bit")

system = """You clean speech-to-text transcriptions into ready-to-send chat messages.
Remove self-corrections (keep only the final version). Remove filler words.
Never rephrase. Never add words. Keep the original language."""

messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": "es a las 5, no perdón, a las 6"},
]

prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True,
    enable_thinking=False,
)

result = generate(model, tokenizer, prompt=prompt, max_tokens=500)
print(result)  # "es a las 6"
```
## Part of keysay
keysay is a macOS press-to-dictate app that uses Qwen3-ASR for speech recognition, a VLM for screen-context extraction, and this model for transcription cleaning.