Demo for model usage?
Hi,
How can I use this model for inference? Would it be possible to add a demo? And what dataset is the model fine-tuned on?
I think you can see it in action in the Tarteel AI app.
```python
import torch
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
import librosa

model_id = "tarteel-ai/whisper-base-ar-quran"

# Load processor + model
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Load audio (16 kHz required)
audio_path = "/content/1.mp3"
audio, sr = librosa.load(audio_path, sr=16000)
print(audio.shape)
print(len(audio) / 16000, "seconds")

# Preprocess
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
input_features = inputs.input_features.to(device)

# Generate transcription
pred_ids = model.generate(input_features)
transcription = processor.batch_decode(pred_ids, skip_special_tokens=True)
print(transcription[0])
```