whisper-small-canario_fono

This model is a fine-tuned version of openai/whisper-small on a phonological transcription dataset derived from the Islas Canarias portion of the COSER corpus. It achieves the following results on the evaluation set:

  • Loss: 1.4465
  • Wer: 104.4996
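A WER above 100% is possible because WER counts insertions as well as substitutions and deletions, normalized by the number of reference words. A minimal pure-Python sketch (the function name and example strings are illustrative, not from the evaluation set):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate as a percentage: (S + D + I) / N_ref * 100."""
    r, h = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(h) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, sub)
    return 100.0 * d[-1][-1] / len(r)

# Insertions alone can push WER past 100%:
print(wer("el gato", "el gato negro grande y"))  # 150.0
```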

Model description

The dataset used for this model is derived from the Islas Canarias portion of the COSER corpus: https://huggingface.co/datasets/johnatanebonilla/coser

This model is intended for experimental purposes: it explores the feasibility of using automatic speech recognition (ASR) systems, such as Whisper, to perform phonological transcription. It is not meant for production use, but rather serves as a research tool for investigating ASR-based phonological transcription.

A key limitation is that the time intervals in the COSER corpus are not systematically aligned, so there may not be a one-to-one correspondence between the audio and text data. This misalignment can introduce errors and inconsistencies into the transcriptions and limit the model's accuracy.

Another significant limitation is the size of the dataset: it is relatively small, and training a robust ASR system with limited data is inherently difficult. Even with careful curation and clean phonological transcriptions, the small amount of data constrains the model's overall performance.

Training and evaluation data

The data was split into 80% training, 10% validation, and 10% test. The training and validation portions were combined for fine-tuning, while the remaining 10% was held out exclusively to assess the model's generalization on previously unseen data.
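The split described above can be sketched as follows (index-based, using seed 42 as in the training setup; the helper name and example size are illustrative):

```python
import random

def split_indices(n_examples: int, seed: int = 42):
    """Shuffle indices and return (train+val, test) as a (80%+10%, 10%) split."""
    idx = list(range(n_examples))
    random.Random(seed).shuffle(idx)
    n_train = int(0.8 * n_examples)
    n_val = int(0.1 * n_examples)
    train_val = idx[: n_train + n_val]  # 90% used for fine-tuning
    test = idx[n_train + n_val:]        # 10% held out for evaluation
    return train_val, test

train_val, test = split_indices(1000)
print(len(train_val), len(test))  # 900 100
```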

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 4000
  • mixed_precision_training: Native AMP
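In the Hugging Face Trainer API, these settings correspond roughly to the following Seq2SeqTrainingArguments. This is a hedged reconstruction from the list above; output_dir is a placeholder, not a value from the original run:

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the listed hyperparameters; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-canario_fono",
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    warmup_steps=500,
    max_steps=4000,
    lr_scheduler_type="linear",
    fp16=True,  # "Native AMP" mixed-precision training
)
```

The Adam betas (0.9, 0.999) and epsilon 1e-08 listed above match the Trainer's defaults, so they need not be set explicitly.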

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer      |
|---------------|-------|------|-----------------|----------|
| 0.1266        | 5.38  | 1000 | 0.9951          | 97.9842  |
| 0.0371        | 10.75 | 2000 | 1.2437          | 109.7012 |
| 0.0197        | 16.13 | 3000 | 1.3983          | 121.5263 |
| 0.013         | 21.51 | 4000 | 1.4465          | 104.4996 |

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.0
  • Tokenizers 0.15.0