# whisper-small-canario_fono
This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the Islas Canarias portion of the COSER corpus. It achieves the following results on the evaluation set:
- Loss: 1.4465
- Wer: 104.4996
## Model description
The dataset used for this model is derived from the Islas Canarias portion of the COSER corpus: https://huggingface.co/datasets/johnatanebonilla/coser
This model is intended for experimental purposes to explore the feasibility of using automatic speech recognition (ASR) systems, such as Whisper, to perform phonological transcription. It is not meant for production use but rather as a research tool to investigate the potential of ASR for phonological transcription tasks.
One limitation of this model is that the time intervals in the COSER corpus are not systematically aligned, so there may not be a perfect one-to-one correspondence between the audio and text data. This misalignment can introduce errors and inconsistencies into the transcriptions and limit the model's accuracy.
Another significant limitation is the size of the dataset: it is relatively small, and training robust ASR systems with limited data is inherently difficult.
Furthermore, despite efforts to curate the dataset and provide clean phonological transcriptions, the current dataset size and quality appear insufficient to yield strong overall performance.
## Training and evaluation data
The data was split into 80% training, 10% validation, and 10% test sets.
The training and validation portions were combined for fine-tuning, while the remaining 10% was reserved exclusively for testing, so that the model's generalization could be assessed on previously unseen data.
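The split above can be sketched as follows, assuming the corpus is available as a list of examples (the function and variable names here are illustrative, not taken from the actual training script):

```python
import random

def split_corpus(examples, seed=42):
    """Shuffle and split into 80% train / 10% validation / 10% test."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.8 * n)
    n_val = int(0.1 * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    # As described above, train and validation are merged for fine-tuning;
    # only the test split is held out.
    return train + val, test

fit_data, test_data = split_corpus(range(100))
```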
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 4000
- mixed_precision_training: Native AMP
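The learning-rate schedule implied by these settings (linear warmup over 500 steps, then linear decay to zero at step 4000) can be written as a plain function; this mirrors the schedule Transformers applies for `lr_scheduler_type: linear`, though the function name here is illustrative:

```python
def linear_warmup_lr(step, max_lr=1e-5, warmup_steps=500, total_steps=4000):
    """Linear warmup to max_lr over warmup_steps, then linear decay to 0."""
    if step < warmup_steps:
        # Ramp up proportionally to the current step.
        return max_lr * step / max(1, warmup_steps)
    # Decay linearly from max_lr at warmup_steps to 0 at total_steps.
    return max_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
```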
## Training results
| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| 0.1266 | 5.38 | 1000 | 0.9951 | 97.9842 |
| 0.0371 | 10.75 | 2000 | 1.2437 | 109.7012 |
| 0.0197 | 16.13 | 3000 | 1.3983 | 121.5263 |
| 0.013 | 21.51 | 4000 | 1.4465 | 104.4996 |
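Note that WER can exceed 100% when the hypothesis contains more insertions and substitutions than the reference has words, as in the results above. A minimal word-level WER based on edit distance (the same definition used by libraries such as `jiwer`) illustrates this:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution or match
    return 100.0 * d[len(ref)][len(hyp)] / len(ref)

# A hypothesis with many insertions yields WER above 100%:
print(wer("la casa", "en la casa grande y bonita"))  # 200.0
```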
## Framework versions
- Transformers 4.36.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.0
- Tokenizers 0.15.0