# keysay-transcription-cleaner-0.8B-8bit
A fine-tuned Qwen3.5-0.8B LLM that cleans speech transcriptions into ready-to-send chat messages. It removes self-corrections and filler words while preserving the speaker's intended meaning.
## What it does
| Input (messy speech) | Output (clean message) |
|---|---|
| es a las 5, no perdón, a las 6 | es a las 6 |
| I think we should, um, you know, go with the first option, no wait, the second one is better | I think we should go with the second one is better |
| Bueno, pues, o sea, la reunión es el martes a las tres. | La reunión es el martes a las tres. |
| el precio es 100, no 200, no perdón, 150 euros | el precio es 150 euros |
| habla con María, no perdón, con Carlos sobre el proyecto | habla con Carlos sobre el proyecto |
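The eval reported under Training is exact-match: a cleaned output counts only if it equals the expected message verbatim. A minimal sketch of such a harness, using two rows from the table above (the `score` helper and the baselines are illustrative, not the released test suite):

```python
# Exact-match scoring sketch: count cases where clean_fn(raw) equals the
# expected message verbatim. In the real suite, clean_fn would call the
# fine-tuned model via mlx_lm.generate.

def score(clean_fn, cases):
    """Return the number of (raw, expected) pairs cleaned to an exact match."""
    return sum(1 for raw, expected in cases if clean_fn(raw).strip() == expected)

cases = [
    ("es a las 5, no perdón, a las 6", "es a las 6"),
    ("habla con María, no perdón, con Carlos sobre el proyecto",
     "habla con Carlos sobre el proyecto"),
]

# A passthrough baseline scores 0: the raw transcription still contains
# the discarded correction, so it never matches the expected message.
identity = lambda s: s
print(score(identity, cases))  # 0
```

Exact match is deliberately strict: it penalizes any rephrasing or added words, mirroring the "never rephrase, never add words" constraint in the system prompt.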
## Training
- Method: LoRA fine-tuning + fusion
- Base model: mlx-community/Qwen3.5-0.8B-8bit
- Teacher: Google Gemini Flash (knowledge distillation)
- Dataset: 360 examples (70% Spanish, 30% English) covering self-corrections, filler removal, nested corrections, and passthrough
- Eval score: 12/12 on hand-verified test suite (base model: 0/12)
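For reference, mlx-lm's LoRA trainer accepts chat-style JSONL, one example per line. One training pair built from the table above might look roughly like this (the field layout follows mlx-lm's chat format; the exact system prompt and examples used in the released dataset are not published, so treat this as a sketch):

```json
{"messages": [
  {"role": "system", "content": "You clean speech-to-text transcriptions into ready-to-send chat messages. Remove self-corrections (keep only the final version). Remove filler words. Never rephrase. Never add words. Keep the original language."},
  {"role": "user", "content": "es a las 5, no perdón, a las 6"},
  {"role": "assistant", "content": "es a las 6"}
]}
```

In the distillation setup described above, the assistant turns would come from the teacher (Gemini Flash) rather than being written by hand.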
## Usage
```python
from mlx_lm import load, generate

model, tokenizer = load("Enriqueag26/keysay-transcription-cleaner-0.8B-8bit")

system = """You clean speech-to-text transcriptions into ready-to-send chat messages.
Remove self-corrections (keep only the final version). Remove filler words.
Never rephrase. Never add words. Keep the original language."""

messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": "es a las 5, no perdón, a las 6"},
]

prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True,
    enable_thinking=False,
)

result = generate(model, tokenizer, prompt=prompt, max_tokens=500)
print(result)  # "es a las 6"
```
## Part of keysay
keysay is a macOS press-to-dictate app that uses Qwen3-ASR for speech recognition, a VLM for screen-context extraction, and this model for transcription cleaning.