Kalemat-Tech Arabic Speech Recognition Model (STT)

The Kalemat-Tech model for recognizing Modern Standard Arabic speech and converting it to text

KalemaTech-Arabic-STT-ASR-based-on-Whisper-Small (SafeTensors)

⚡ This is a SafeTensors conversion of the original model.

Model Description

This model is a fine-tuned version of Whisper Small trained on Common Voice Arabic 12.0 (augmented dataset).

Performance

Validation loss: 0.5362
WER: 58.5848%
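
For reference, WER is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words; the figure above is that ratio expressed as a percentage. The sketch below is a minimal, dependency-free illustration of that definition. The function name `word_error_rate` is hypothetical, and the reported score was most likely computed with a library metric rather than this code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: 100 * (S + D + I) / N over whitespace-split words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)
```

Because insertions count as errors, this value can exceed 100 when the hypothesis is much longer than the reference.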

What Changed in This Conversion?

Aspect	Original	This Version
Weight Format	PyTorch `.bin`	SafeTensors `.safetensors`
Model Architecture	Unchanged	Unchanged
Weights / Performance	Baseline	Identical (lossless)
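Loading Speed	Standard (pickle deserialization)	Faster (memory-mapped)
Security	Standard (pickle-based, can run code on load)	Improved (no code execution on load)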

Why SafeTensors?

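Safer: Loading never unpickles data, so a weights file cannot trigger arbitrary code execution
Faster: Weights can be memory-mapped instead of deserialized
Efficient: Memory mapping can avoid holding an extra copy of the weights while loading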
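
The safety point follows from the file layout: a `.safetensors` file starts with an 8-byte little-endian header length, followed by a JSON header that lists each tensor's dtype, shape, and byte offsets, followed by the raw tensor bytes. Inspecting it therefore needs nothing beyond a JSON parser. The sketch below reads only that header using the standard library; `read_safetensors_header` and `count_parameters` are hypothetical helper names, not part of any library.

```python
import json
import struct
from math import prod


def read_safetensors_header(path: str) -> dict:
    """Parse only the JSON header of a .safetensors file; no tensor data is loaded."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def count_parameters(header: dict) -> int:
    """Sum the element counts of all tensors listed in the header."""
    return sum(
        prod(entry["shape"])
        for name, entry in header.items()
        if name != "__metadata__"  # optional free-form metadata, not a tensor
    )
```

Pointing `read_safetensors_header` at a downloaded weights file gives a quick way to list tensor names and dtypes without loading the model.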

Usage Example

from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq

# Replace YOUR_USERNAME with the account that hosts this repository.
model_id = "YOUR_USERNAME/KalemaTech-Arabic-STT-ASR-based-on-Whisper-Small-SafeTensors"

processor = AutoProcessor.from_pretrained(model_id)  # feature extractor + tokenizer
model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id)
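
Loading the processor and model is only the first step. A transcription call might look like the sketch below, which assumes the `transformers` ASR pipeline, an audio file on disk, and ffmpeg available for decoding. `transcribe` is a hypothetical helper, and `YOUR_USERNAME` is still a placeholder.

```python
MODEL_ID = "YOUR_USERNAME/KalemaTech-Arabic-STT-ASR-based-on-Whisper-Small-SafeTensors"  # placeholder repo id


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file with the Hugging Face ASR pipeline."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # Whisper operates on 30-second windows
    )
    return asr(audio_path)["text"]
```

Calling `transcribe("sample.wav")`, where the path is a placeholder, would then return the recognized Arabic text as a string.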

Intended Use

Automatic Speech Recognition for Arabic (Modern Standard Arabic)

Limitations

High WER (about 58.6% on the Common Voice Arabic 12.0 test set)
May struggle with dialects and noisy audio

Training Data

Common Voice Arabic 12.0

Augmentations

25% TimeMasking
25% SpecAugmentation
25% Gaussian Noise
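
How these augmentations were implemented is not stated here. As a rough illustration only, the sketch below applies two of them to a raw waveform held in a plain Python list: time masking silences one random contiguous span, and Gaussian noise is added at a chosen signal-to-noise ratio. The function names, the `max_fraction` and `snr_db` parameters, and their default values are assumptions, not the settings used for training.

```python
import math
import random


def time_mask(samples, max_fraction=0.1, rng=random):
    """Return a copy with one random contiguous span set to silence (0.0)."""
    n = len(samples)
    width = rng.randint(0, int(n * max_fraction))
    start = rng.randint(0, n - width)
    masked = list(samples)
    masked[start:start + width] = [0.0] * width
    return masked


def add_gaussian_noise(samples, snr_db=20.0, rng=random):
    """Return a copy with zero-mean Gaussian noise added at the given SNR in dB."""
    signal_power = sum(x * x for x in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in samples]
```

Passing a seeded `random.Random` instance as `rng` makes the augmentation reproducible, which helps when comparing training runs.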

Training Hyperparameters

Learning rate: 1e-05
Train batch size: 64
Eval batch size: 8
Epochs: 25
Optimizer: Adam
Scheduler: Linear
Warmup steps: 500
Mixed precision: AMP
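
A linear scheduler with 500 warmup steps implies a learning rate that ramps from 0 up to 1e-05 and then decays linearly to 0 by the end of training. A minimal sketch of that shape follows. The total number of training steps is not given here, so any `total_steps` value passed in is the caller's assumption, and `linear_warmup_lr` is a hypothetical helper rather than the trainer's own code.

```python
def linear_warmup_lr(step: int, total_steps: int,
                     base_lr: float = 1e-05, warmup_steps: int = 500) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```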

Training Results

Epoch	Training Loss	Validation Loss	WER (%)
1	0.2728	0.3063	60.47
2	0.1442	0.2878	55.69
3	0.0648	0.3009	59.25
4	0.0318	0.3278	59.29
5	0.0148	0.3539	61.03
6	0.0088	0.3714	56.91
7	0.0061	0.3920	57.55
8	0.0041	0.4149	61.63
9	0.0033	0.4217	58.03
10	0.0033	0.4376	59.95
15	0.0008	0.4856	60.71
20	0.0002	0.5155	58.09
24	0.0001	0.5362	58.58
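
Reading the table: validation loss is lowest at epoch 2 and climbs afterwards while training loss approaches zero, a pattern usually read as overfitting, and the lowest WER also occurs at epoch 2 rather than at the final checkpoint. The snippet below simply recomputes this from the rows listed above; the variable names are illustrative.

```python
# (epoch, training loss, validation loss, WER %) copied from the table above
results = [
    (1, 0.2728, 0.3063, 60.47),
    (2, 0.1442, 0.2878, 55.69),
    (3, 0.0648, 0.3009, 59.25),
    (4, 0.0318, 0.3278, 59.29),
    (5, 0.0148, 0.3539, 61.03),
    (6, 0.0088, 0.3714, 56.91),
    (7, 0.0061, 0.3920, 57.55),
    (8, 0.0041, 0.4149, 61.63),
    (9, 0.0033, 0.4217, 58.03),
    (10, 0.0033, 0.4376, 59.95),
    (15, 0.0008, 0.4856, 60.71),
    (20, 0.0002, 0.5155, 58.09),
    (24, 0.0001, 0.5362, 58.58),
]

best_by_wer = min(results, key=lambda row: row[3])
best_by_val_loss = min(results, key=lambda row: row[2])
final = results[-1]

print(f"Lowest WER: {best_by_wer[3]} at epoch {best_by_wer[0]}")
print(f"Lowest validation loss: {best_by_val_loss[2]} at epoch {best_by_val_loss[0]}")
print(f"Final checkpoint WER is {final[3] - best_by_wer[3]:.2f} points higher")
```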

Framework Versions

Transformers 4.25.1
PyTorch 1.13.1+cu117
Datasets 2.8.0
Tokenizers 0.13.2

Conversion Details

Format: SafeTensors
Type: Lossless conversion
Weights: Numerically identical
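
The conversion script itself is not included here. One way such a lossless conversion and check could be done with `transformers` and `safetensors` is sketched below. `convert_and_verify`, its directory arguments, and the single-file name `model.safetensors` are assumptions about a typical setup, not a record of what was actually run.

```python
import os


def convert_and_verify(source: str, output_dir: str) -> int:
    """Re-save a checkpoint as SafeTensors and check the saved tensors bit-for-bit.

    Returns the number of tensors compared.
    """
    # Imported lazily so that defining this helper does not require these packages.
    import torch
    from safetensors.torch import load_file
    from transformers import AutoModelForSpeechSeq2Seq

    model = AutoModelForSpeechSeq2Seq.from_pretrained(source)
    model.save_pretrained(output_dir, safe_serialization=True)

    saved = load_file(os.path.join(output_dir, "model.safetensors"))
    reference = model.state_dict()
    for name, tensor in saved.items():
        # torch.equal requires identical shape and identical element values.
        if not torch.equal(tensor, reference[name]):
            raise ValueError(f"tensor {name!r} differs after conversion")
    return len(saved)
```

Only tensors present in the saved file are compared, since tied weights may be stored once even though they appear under more than one name in the state dict.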

Credits

Original Model: Mohamed Salama
Conversion: SafeTensors format for better performance and security

Safetensors

Model size: 0.2B params
Tensor type: F32

Evaluation results

WER on mozilla-foundation/common_voice_12_0 (test set, self-reported): 58.585