# Wav2Vec2-BERT Twi ASR (CTranslate2 Optimized)
This is a CTranslate2-optimized version of the original model for fast inference.

- **Original model:** `ghananlpcommunity/w2v-bert-2.0_twi_alpha_v1_farmerline`
- **Quantization:** float16
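For reference, a float16 CTranslate2 checkpoint like this one is typically produced with the `ct2-transformers-converter` CLI that ships with CTranslate2. A hedged sketch (the output directory name is illustrative, and Wav2Vec2-BERT conversion assumes a recent CTranslate2 release):

```shell
# Convert the original Transformers checkpoint to CTranslate2 format
# with float16 quantization (output_dir name is an example, not the
# exact command used to build this repo).
ct2-transformers-converter \
  --model ghananlpcommunity/w2v-bert-2.0_twi_alpha_v1_farmerline \
  --quantization float16 \
  --output_dir w2v-bert-2.0_twi_alpha_v1_farmerline-ct2
```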
## Installation

```bash
pip install ctranslate2 transformers huggingface_hub librosa numpy
```
## Usage (Inference)

```python
import ctranslate2
import librosa
import numpy as np
from huggingface_hub import snapshot_download
from transformers import AutoProcessor

model_id = "ghananlpcommunity/w2v-bert-2.0_twi_alpha_v1_farmerline-ct2"

# 1. Load audio (the model expects 16 kHz mono)
audio, sr = librosa.load("audio.wav", sr=16000)

# 2. Load the processor, then fetch the CTranslate2 weights
# (CTranslate2 loads from a local directory, not a Hub repo id)
processor = AutoProcessor.from_pretrained(model_id)
model_path = snapshot_download(model_id)

# Use device="cuda" for GPU or device="cpu" for CPU.
# Wav2Vec2-BERT support requires a recent CTranslate2 release.
model = ctranslate2.models.Wav2Vec2Bert(model_path, device="cuda")

# 3. Extract log-mel features via the Wav2Vec2-BERT processor
inputs = processor(audio, sampling_rate=16000, return_tensors="np")
input_features = inputs.input_features.astype(np.float32)

# 4. Run the encoder (to_cpu=True copies the output back from the GPU)
features = ctranslate2.StorageView.from_array(input_features)
outputs = model.encode(features, to_cpu=True)

# 5. Greedy CTC decode
logits = np.asarray(outputs)
predicted_ids = np.argmax(logits, axis=-1)
transcription = processor.batch_decode(predicted_ids)
print("Transcription:", transcription[0])
```
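In step 5, `processor.batch_decode` applies the standard greedy CTC rule: collapse consecutive repeated ids, then drop the blank token. A minimal self-contained sketch of that rule with a toy vocabulary (the ids and `blank_id=0` here are illustrative, not this model's actual vocabulary):

```python
def ctc_greedy_collapse(ids, blank_id=0):
    """Collapse consecutive repeats, then drop blanks (greedy CTC decoding)."""
    out = []
    prev = None
    for i in ids:
        # Keep a token only when it differs from the previous frame
        # and is not the CTC blank.
        if i != prev and i != blank_id:
            out.append(i)
        prev = i
    return out

# Toy frame-level argmax ids: blank=0; "h"=1, "e"=2, "l"=3, "o"=4
frame_ids = [1, 1, 0, 2, 2, 3, 0, 3, 4, 4]
print(ctc_greedy_collapse(frame_ids))  # [1, 2, 3, 3, 4] -> "hello"
```

Note that a blank between two identical ids (the `3, 0, 3` above) is what allows doubled letters such as "ll" to survive the collapse.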