TATA: Teach, Align, Transduce, Adapt
A FastConformer-RNN-T model (32M parameters) for Egyptian Arabic speech recognition with an integrated speaker diarization pipeline. Developed for the MTC-AIC II competition, where it placed 2nd with a Mean Levenshtein Distance of 9.59.
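For readers unfamiliar with the metric, here is a minimal sketch of how a mean Levenshtein distance over a set of transcripts can be computed. It assumes character-level edit distance with unit costs, averaged per utterance. The competition's exact scoring (tokenization, normalization, handling of speaker labels) is not described here, so treat `levenshtein` and `mean_levenshtein` as illustrative helpers, not the official scorer.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance; insert, delete, substitute each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def mean_levenshtein(references, hypotheses):
    """Average edit distance over paired reference/hypothesis transcripts."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    if not references:
        raise ValueError("at least one pair is required")
    total = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total / len(references)
```

Lower is better: a score of 0 would mean every hypothesis matches its reference exactly.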
Quick Start
Installation
conda install -c conda-forge llvmlite numba
pip install "tata-asr @ git+https://huggingface.co/yousefkotp/TATA-egyptian-arabic-asr-diarization"
ASR Only (3 lines)
from tata import TATA
model = TATA()
text = model.transcribe("audio.wav")
print(text)
ASR + Speaker Diarization (4 lines)
from tata import TATA
model = TATA()
segments = model.transcribe("audio.wav", diarize=True)
for seg in segments:
    print(f"[Speaker {seg.speaker}] {seg.start:.1f}s - {seg.end:.1f}s: {seg.text}")
Model weights and tokenizer are downloaded automatically on first use.
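Diarized output usually needs light post-processing. The sketch below assumes only that each segment exposes `speaker`, `start`, `end`, and `text`, as in the loop above. The `Segment` dataclass is a hypothetical stand-in so the example runs without downloading the model, and `max_gap` is an illustrative parameter, not a library setting.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical stand-in mirroring the attributes used in the Quick Start loop.
    speaker: int
    start: float
    end: float
    text: str


def talk_time_by_speaker(segments):
    """Total speaking time in seconds for each speaker label."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg.speaker] += seg.end - seg.start
    return dict(totals)


def merge_adjacent(segments, max_gap=0.5):
    """Merge consecutive segments from the same speaker separated by <= max_gap seconds."""
    merged = []
    for seg in sorted(segments, key=lambda s: s.start):
        last = merged[-1] if merged else None
        if last and last.speaker == seg.speaker and seg.start - last.end <= max_gap:
            merged[-1] = Segment(last.speaker, last.start,
                                 max(last.end, seg.end),
                                 f"{last.text} {seg.text}")
        else:
            merged.append(seg)
    return merged
```

With the real model, the list returned by `model.transcribe("audio.wav", diarize=True)` could be passed to these helpers directly, provided its segments carry the same four attributes.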
Architecture
| Component | Details |
|---|---|
| ASR Encoder | FastConformer, 16 layers, d_model=256, 4 heads |
| ASR Decoder | RNN-T with pred_hidden=640 |
| Tokenizer | BPE, vocab_size=256, trained on Egyptian Arabic |
| VAD | MarbleNet (multilingual) |
| Speaker Embeddings | TitaNet-Large + ECAPA-TDNN (concatenated, 384-dim) |
| Clustering | Agglomerative Hierarchical Clustering |
| Source Separation | Demucs (htdemucs, vocal isolation) |
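To make the speaker-embedding and clustering rows concrete, here is a dependency-free sketch of fusing two embeddings by concatenation and grouping the results with agglomerative clustering. Several choices are assumptions rather than details from this model card: each embedding is L2-normalized before concatenation, the distance is cosine, the linkage is average, and merging stops at a hypothetical `threshold`. The 384-dim total in the table would be consistent with two 192-dim embeddings, but that split is also an assumption.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def fuse(emb_a, emb_b):
    """Normalize each embedding, then concatenate them into one vector."""
    return l2_normalize(emb_a) + l2_normalize(emb_b)


def cosine_distance(u, v):
    u, v = l2_normalize(u), l2_normalize(v)
    return 1.0 - sum(x * y for x, y in zip(u, v))


def agglomerative(vectors, threshold):
    """Average-linkage clustering; stops when the closest pair exceeds threshold."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        dist, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels
```

This naive loop recomputes every linkage on each merge, which is fine for illustration but slow for long recordings; a production pipeline would typically rely on an optimized library implementation.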
Training
Trained from scratch using a four-stage curriculum:
- Teach -- CTC pretraining on LLM-generated synthetic speech
- Align -- CTC fine-tuning on real Egyptian Arabic data
- Transduce -- RNN-T training with encoder-only transfer from CTC
- Adapt -- Domain adaptation on competition-specific data
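The "encoder-only transfer" in the Transduce stage can be pictured as copying just the encoder parameters from the CTC checkpoint into a freshly initialized RNN-T model, leaving the prediction and joint networks at their initial values. The sketch below uses plain dictionaries in place of real checkpoints. The parameter names and the `encoder.` prefix are hypothetical, and this is not the authors' training code.

```python
def transfer_encoder(ctc_state, rnnt_state, prefix="encoder."):
    """Return a copy of rnnt_state with encoder entries taken from ctc_state.

    Only names that start with `prefix` and exist in both state dicts are
    copied; CTC-specific heads are ignored and RNN-T-specific modules are
    left untouched.
    """
    updated = dict(rnnt_state)
    copied = []
    for name, value in ctc_state.items():
        if name.startswith(prefix) and name in updated:
            updated[name] = value
            copied.append(name)
    return updated, copied


# Toy checkpoints with hypothetical parameter names and scalar "weights".
ctc_checkpoint = {
    "encoder.layers.0.weight": 1.0,
    "decoder.ctc_head.weight": 2.0,
}
rnnt_initial = {
    "encoder.layers.0.weight": 0.0,
    "decoder.prediction.weight": 0.0,
    "joint.weight": 0.0,
}

rnnt_state, copied_names = transfer_encoder(ctc_checkpoint, rnnt_initial)
print(copied_names)
```

In a PyTorch setup, a similar effect is commonly achieved by filtering the checkpoint's keys and calling `load_state_dict(..., strict=False)` on the new model.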
Hardware: Single NVIDIA P100 (16 GB), FP32 precision.
Citation
@misc{tata2025,
title={Fake It, Then Make It: Synthetic-to-Real Training for Egyptian Arabic ASR with Diarization},
author={Kotp, Yousef and Alaa, Karim and El-nenaey, Abdelrahman and Barakat, Rana and Zahran, Loaui and El Yamany, Ismael},
year={2025},
institution={Alexandria University}
}
License
Apache 2.0
Links
- Code: github.com/yousefkotp/Egyptian-Arabic-ASR-and-Diarization
- Competition: MTC-AIC II on Kaggle