2025-07-23:
CER:

Dataset	Lang	Split	CER (%)
Training	yue	validation	10.65
mozilla-foundation/common_voice_17_0	yue	test	1.188
mozilla-foundation/common_voice_17_0	en	test (2k samples)	7.583
mozilla-foundation/common_voice_16_1	zh-CN	test	13.96
JackyHoCL/cleaned_mixed_cantonese_and_english_speech	yue	test	11.9775

2025-07-04:
CER:

Dataset	Lang	Split	CER (%)
Training	yue	validation	11.39
mozilla-foundation/common_voice_17_0	yue	test
mozilla-foundation/common_voice_16_1	yue	test	12.2
JackyHoCL/cleaned_mixed_cantonese_and_english_speech	yue	test
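
For readers unfamiliar with the metric, the CER figures in the tables above are character error rates: character-level edit distance divided by the number of reference characters, shown as a percentage. The sketch below is a minimal pure-Python illustration of that definition. It is not the evaluation script used for this model, and any text normalization (punctuation, casing, spacing) applied before scoring is unknown, so it is left out here. The helper names `edit_distance` and `cer_percent` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer_percent(references, hypotheses):
    """Corpus-level CER in percent: total edits / total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return 100.0 * total_edits / total_chars


# Made-up example sentences, not taken from any of the datasets above.
refs = ["你好嗎", "hello"]
hyps = ["你好", "hallo"]
print(cer_percent(refs, hyps))  # 25.0 (2 edits over 8 reference characters)
```

Because the edits are summed over the whole corpus before dividing, long utterances weigh more than short ones. Averaging per-utterance CER would give a different number.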

per_device_train_batch_size=96,
learning_rate=1e-6,

CER: 15.4%

transformers-4.46.3

Train Args:
per_device_train_batch_size=32,
gradient_accumulation_steps=1,
learning_rate=1e-5,
gradient_checkpointing=True,
per_device_eval_batch_size=64,
generation_max_length=225,

Hardware:
4 × NVIDIA Tesla V100 16GB
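
The training arguments and hardware listed above can be combined as in the sketch below. It collects the listed values in a plain dict and computes the effective batch size, assuming all four GPUs were used data-parallel (the card does not state this). `build_training_arguments` passes the same values to `Seq2SeqTrainingArguments` from transformers. The `output_dir` argument and the helper names are placeholders, and any argument not listed in this card keeps its library default.

```python
# Values copied from the "Train Args" list in this card.
TRAIN_ARGS = {
    "per_device_train_batch_size": 32,
    "gradient_accumulation_steps": 1,
    "learning_rate": 1e-5,
    "gradient_checkpointing": True,
    "per_device_eval_batch_size": 64,
    "generation_max_length": 225,
}

# From the Hardware section. ASSUMPTION: all four GPUs ran data-parallel.
NUM_GPUS = 4


def effective_batch_size(args, num_devices):
    """Samples consumed per optimizer step across all devices."""
    return (args["per_device_train_batch_size"]
            * args["gradient_accumulation_steps"]
            * num_devices)


def build_training_arguments(output_dir):
    """Build Seq2SeqTrainingArguments from the dict (requires transformers)."""
    # Imported lazily so the rest of the sketch runs without transformers.
    from transformers import Seq2SeqTrainingArguments
    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAIN_ARGS)


print(effective_batch_size(TRAIN_ARGS, NUM_GPUS))  # 128
```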

An example real-time streaming application built on this model:
https://github.com/JackyHoCL/whisper-realtime.git
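
For offline (non-streaming) transcription, a minimal sketch using the transformers `pipeline` API might look like the following. This is an illustration, not code from this card or from the streaming repository. It assumes `transformers` and a PyTorch backend are installed, that ffmpeg is available for decoding audio files, and that the clip is short (roughly 30 seconds or less). The `transcribe` helper is a hypothetical name.

```python
MODEL_ID = "JackyHoCL/whisper-small-cantonese-yue-english"


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the transcript of one short audio file."""
    # Imported lazily so this file can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # dict with a "text" key
    return result["text"]


if __name__ == "__main__":
    import sys

    # Usage: python transcribe.py path/to/audio.wav
    if len(sys.argv) > 1:
        print(transcribe(sys.argv[1]))
```

Building the pipeline once and reusing it across files avoids reloading the model for every call. The helper above rebuilds it each time only to stay short.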

FAQ:

If you run into tokenizer issues during inference, upgrade transformers to version 4.49.0 or later:

pip install --upgrade transformers
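
To see whether an environment already meets the 4.49.0 minimum from the FAQ entry above before upgrading, a small check along these lines can help. It is a sketch using only the standard library. `parse_version` and `needs_upgrade` are hypothetical helpers that compare just the first three numeric components and ignore suffixes such as `.dev0`.

```python
from importlib.metadata import PackageNotFoundError, version

MIN_VERSION = (4, 49, 0)  # minimum transformers version named in the FAQ


def parse_version(text):
    """Turn '4.46.3' or '4.50.0.dev0' into a (major, minor, patch) tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def needs_upgrade(installed):
    """True if the installed version string is older than MIN_VERSION."""
    return parse_version(installed) < MIN_VERSION


try:
    installed = version("transformers")
except PackageNotFoundError:
    print("transformers is not installed")
else:
    status = "upgrade recommended" if needs_upgrade(installed) else "ok"
    print(f"transformers {installed}: {status}")
```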

Model: JackyHoCL/whisper-small-cantonese-yue-english
Base model: openai/whisper-small
Model size: 0.2B params (Safetensors, F32)
