Kalemat-Tech Arabic Speech Recognition Model (STT)
The Kalemat model for recognizing Modern Standard Arabic (Fus'ha) speech and converting it to text
KalemaTech-Arabic-STT-ASR-based-on-Whisper-Small (SafeTensors)
⚡ This is a SafeTensors conversion of the original model.
Model Description
This model is a fine-tuned version of Whisper Small trained on Common Voice Arabic 12.0 (augmented dataset).
Performance
- Loss: 0.5362
- WER: 58.5848
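For context on the WER figure: word error rate is the word-level edit distance between the reference transcript and the model's output, divided by the number of reference words, and it is usually reported as a percentage. The sketch below implements that definition in plain Python. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions; the original evaluation may have used a library implementation and different text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from the hypothesis
                curr[j - 1] + 1,     # insertion: extra word in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("the cat sat down", "the cat sat up"))
```

Read this way, a WER of 58.58 corresponds to roughly 59 word-level errors per 100 reference words.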
What Changed in This Conversion?
| Aspect | Original | This Version |
|---|---|---|
| Weight Format | PyTorch `.bin` | SafeTensors `.safetensors` |
| Model Architecture | Unchanged | Unchanged |
| Weights / Performance | Baseline | Identical (lossless) |
| Loading Speed | Standard | Faster |
| Security | Standard | Improved |
Why SafeTensors?
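- Safer: Loading does not unpickle data, so a weights file cannot execute arbitrary code
- Faster: Weights can be memory-mapped instead of deserialized
- Efficient: Memory mapping avoids holding an extra copy of the weights while loading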
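The safety point follows from the file layout: a SafeTensors file is an 8-byte little-endian header length, a JSON header describing each tensor (dtype, shape, byte offsets), and then the raw tensor bytes. The sketch below writes and reads a toy file with only the standard library to make that layout concrete. It is an illustration under simplifying assumptions (one 1-D float32 tensor, no metadata, no alignment padding), not the official `safetensors` implementation; the helper names and the tensor name `toy.weight` are made up for the example.

```python
import json
import os
import struct
import tempfile


def write_minimal_safetensors(path, name, values):
    """Write one 1-D float32 tensor in the SafeTensors layout (illustrative writer)."""
    data = struct.pack(f"<{len(values)}f", *values)  # raw little-endian float32 bytes
    header = {name: {"dtype": "F32", "shape": [len(values)], "data_offsets": [0, len(data)]}}
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte little-endian header length
        f.write(header_bytes)                          # JSON header: names, dtypes, shapes, offsets
        f.write(data)                                  # tensor bytes only, nothing executable


def read_tensor(path, name):
    """Parse the JSON header, then unpack the named float32 tensor from the byte buffer."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
        start, end = header[name]["data_offsets"]  # offsets are relative to the end of the header
        f.seek(8 + header_len + start)
        raw = f.read(end - start)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.safetensors")
    write_minimal_safetensors(path, "toy.weight", [0.5, -1.25, 3.0])
    restored = read_tensor(path, "toy.weight")

print(restored)
```

Reading the file involves only integer and JSON parsing plus byte copies, which is why loading cannot trigger code execution the way unpickling a `.bin` checkpoint can. The stored bytes are copied verbatim, so the values come back unchanged.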
Usage Example
```python
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq

# Replace YOUR_USERNAME with the account that hosts this repository.
repo_id = "YOUR_USERNAME/KalemaTech-Arabic-STT-ASR-based-on-Whisper-Small-SafeTensors"
processor = AutoProcessor.from_pretrained(repo_id)
model = AutoModelForSpeechSeq2Seq.from_pretrained(repo_id)
```
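The snippet above only loads the processor and model. A minimal sketch of the transcription step might look like the following. It assumes the audio has already been decoded to a 1-D float waveform at 16 kHz mono (the rate Whisper checkpoints expect) and that the checkpoint's own generation settings select Arabic transcription. The helper name `transcribe` is hypothetical, and how you load the audio file is left open.

```python
SAMPLE_RATE = 16_000  # Whisper feature extraction expects 16 kHz mono audio


def transcribe(waveform, processor, model):
    """Transcribe one 16 kHz mono waveform with an already-loaded processor and model."""
    # Convert the raw waveform into log-mel input features as PyTorch tensors.
    inputs = processor(waveform, sampling_rate=SAMPLE_RATE, return_tensors="pt")
    # Autoregressively generate token ids, relying on the checkpoint's default settings.
    generated_ids = model.generate(inputs.input_features)
    # Decode the ids to text and drop special tokens such as language and task markers.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]


# Usage, given `processor` and `model` loaded as shown above and a waveform
# (for example a NumPy float32 array) that you decoded yourself:
# text = transcribe(waveform, processor, model)
# print(text)
```

Passing the processor and model in as arguments keeps the helper independent of how the checkpoint was loaded, so the same function works for the original `.bin` weights and for this SafeTensors conversion.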
Intended Use
Automatic Speech Recognition for Arabic (Modern Standard Arabic)
Limitations
- High WER (about 58.6% on the Common Voice Arabic 12.0 test set)
- May struggle with dialects and noisy audio
Training Data
- Common Voice Arabic 12.0
Augmentations
- 25% TimeMasking
- 25% SpecAugmentation
- 25% Gaussian Noise
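The card does not say how these augmentations were implemented or what "25%" refers to. One plausible reading is that each augmentation is applied to a clip with probability 0.25. The sketch below illustrates that reading for the two waveform-level augmentations using only the standard library; SpecAugment works on spectrograms and is not covered. All function names, the noise level `std`, and the mask length `max_fraction` are assumptions made for the example, not values from the original training setup.

```python
import random


def add_gaussian_noise(samples, rng, std=0.005):
    """Add zero-mean Gaussian noise to every sample (std is a hypothetical value)."""
    return [s + rng.gauss(0.0, std) for s in samples]


def time_mask(samples, rng, max_fraction=0.1):
    """Zero out one random contiguous span covering at most max_fraction of the clip."""
    span = rng.randint(0, int(len(samples) * max_fraction))
    start = rng.randint(0, len(samples) - span)
    return samples[:start] + [0.0] * span + samples[start + span:]


def augment(samples, rng, p=0.25):
    """Apply each augmentation independently with probability p (assumed meaning of 25%)."""
    out = list(samples)
    if rng.random() < p:
        out = time_mask(out, rng)
    if rng.random() < p:
        out = add_gaussian_noise(out, rng)
    return out


rng = random.Random(0)          # seeded so the run is repeatable
clip = [0.1] * 1000             # hypothetical stand-in for a real waveform
augmented = augment(clip, rng)
print(len(augmented) == len(clip))
```

Both augmentations preserve the clip length, so they can be applied before feature extraction without changing the rest of the pipeline.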
Training Hyperparameters
- Learning rate: 1e-05
- Train batch size: 64
- Eval batch size: 8
- Epochs: 25
- Optimizer: Adam
- Scheduler: Linear
- Warmup steps: 500
- Mixed precision: AMP
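To make the scheduler settings concrete, the sketch below computes the learning rate under a linear schedule with warmup: a linear ramp from 0 to the base rate over the warmup steps, then a linear decay to 0 at the end of training. This assumes "Linear" refers to the standard linear-with-warmup schedule in Transformers. The total step count is not given in the card, so `TOTAL_STEPS` is a hypothetical placeholder, as is the function name.

```python
def linear_warmup_lr(step, total_steps, base_lr=1e-5, warmup_steps=500):
    """Learning rate at `step`: linear ramp to base_lr during warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 10_000  # hypothetical: the card does not state the total number of optimizer steps

for step in (0, 250, 500, 5_250, 10_000):
    print(step, linear_warmup_lr(step, TOTAL_STEPS))
```

With these settings the rate peaks at 1e-05 exactly when warmup ends at step 500 and falls steadily from there.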
Training Results
| Epoch | Training Loss | Validation Loss | WER |
|---|---|---|---|
| 1 | 0.2728 | 0.3063 | 60.47 |
| 2 | 0.1442 | 0.2878 | 55.69 |
| 3 | 0.0648 | 0.3009 | 59.25 |
| 4 | 0.0318 | 0.3278 | 59.29 |
| 5 | 0.0148 | 0.3539 | 61.03 |
| 6 | 0.0088 | 0.3714 | 56.91 |
| 7 | 0.0061 | 0.3920 | 57.55 |
| 8 | 0.0041 | 0.4149 | 61.63 |
| 9 | 0.0033 | 0.4217 | 58.03 |
| 10 | 0.0033 | 0.4376 | 59.95 |
| 15 | 0.0008 | 0.4856 | 60.71 |
| 20 | 0.0002 | 0.5155 | 58.09 |
| 24 | 0.0001 | 0.5362 | 58.58 |
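Reading the table, training loss keeps falling toward zero while validation loss rises after epoch 2, a pattern consistent with overfitting, and WER fluctuates without a clear downward trend. The short script below, using only the rows listed above (the table omits some epochs), picks out the rows with the lowest WER and lowest validation loss and compares them with the final row.

```python
# (epoch, training loss, validation loss, WER) rows copied from the table above
results = [
    (1, 0.2728, 0.3063, 60.47),
    (2, 0.1442, 0.2878, 55.69),
    (3, 0.0648, 0.3009, 59.25),
    (4, 0.0318, 0.3278, 59.29),
    (5, 0.0148, 0.3539, 61.03),
    (6, 0.0088, 0.3714, 56.91),
    (7, 0.0061, 0.3920, 57.55),
    (8, 0.0041, 0.4149, 61.63),
    (9, 0.0033, 0.4217, 58.03),
    (10, 0.0033, 0.4376, 59.95),
    (15, 0.0008, 0.4856, 60.71),
    (20, 0.0002, 0.5155, 58.09),
    (24, 0.0001, 0.5362, 58.58),
]

best_wer = min(results, key=lambda row: row[3])  # row with the lowest WER
best_val = min(results, key=lambda row: row[2])  # row with the lowest validation loss
final = results[-1]                              # last listed row, matching the reported metrics

print(f"lowest WER: epoch {best_wer[0]} (WER {best_wer[3]})")
print(f"lowest validation loss: epoch {best_val[0]} (loss {best_val[2]})")
print(f"final row: epoch {final[0]} (WER {final[3]}, loss {final[2]})")
```

Among the listed rows, epoch 2 has both the lowest WER and the lowest validation loss, while the reported Loss and WER correspond to the final row at epoch 24.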
Framework Versions
- Transformers 4.25.1
- PyTorch 1.13.1+cu117
- Datasets 2.8.0
- Tokenizers 0.13.2
Conversion Details
- Format: SafeTensors
- Type: Lossless conversion
- Weights: Numerically identical
Credits
- Original Model: Mohamed Salama
- Conversion: Repackaged in SafeTensors format for faster, safer loading
Evaluation results
- WER on the mozilla-foundation/common_voice_12_0 test set (self-reported): 58.585