Demo for model usage?

#2
by TheaMT - opened

Hi,

How can I use this model for inference? Would it be possible to add a demo? And what dataset is the model fine-tuned on?

I think you can see it in the Tarteel AI app. Here is how to run inference with the model:

```python
import torch
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
import librosa

model_id = "tarteel-ai/whisper-base-ar-quran"

# Load processor + model
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Load audio (16 kHz required)
audio_path = "/content/1.mp3"
audio, sr = librosa.load(audio_path, sr=16000)

# Preprocess
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
input_features = inputs.input_features.to(device)

print(audio.shape)
print(len(audio) / 16000, "seconds")

# Generate transcription
pred_ids = model.generate(input_features)
transcription = processor.batch_decode(pred_ids, skip_special_tokens=True)

print(transcription[0])
```
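If you just want a quick demo, the `transformers` pipeline API wraps the load/preprocess/generate steps above in one call. This is a minimal sketch, not from the model card; the silent test clip is only there so the snippet is self-contained:

```python
import numpy as np
from transformers import pipeline

# The pipeline handles resampling, feature extraction, and decoding internally.
asr = pipeline(
    "automatic-speech-recognition",
    model="tarteel-ai/whisper-base-ar-quran",
)

# Accepts a file path or a raw 16 kHz float32 array; a 1-second silent
# clip is used here so the example runs without any audio file.
audio = np.zeros(16000, dtype=np.float32)
result = asr(audio)
print(result["text"])
```

To transcribe your own recording, pass its path instead, e.g. `asr("/content/1.mp3")` (requires `ffmpeg` for mp3 decoding).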
