OkayLID
OkayLID is a FastText language identification model of only 3 megabytes, meant for basic language detection. It can detect over 201 languages despite its extremely small size. OkayLID was trained on a subset of the OpenLID dataset.
Installation
pip install fasttext huggingface_hub
Usage
import numpy as np
import fasttext
from huggingface_hub import hf_hub_download


def setup_environment():
    # The fasttext Python bindings call np.array(..., copy=False), which
    # NumPy 2.x rejects when a copy is required. Route those calls to
    # np.asarray, which copies only when necessary.
    original_array = np.array

    def fixed_array(obj, *args, **kwargs):
        if kwargs.get("copy") is False:
            return np.asarray(obj)
        return original_array(obj, *args, **kwargs)

    np.array = fixed_array


setup_environment()

# Download the model file from the Hugging Face Hub and load it.
model_path = hf_hub_download(repo_id="Cutecat6152/OkayLID", filename="OkayLID.bin")
model = fasttext.load_model(model_path)

text = "The quick brown fox jumps over the lazy dog."
labels, probs = model.predict(text, k=1)  # k=1 returns only the top prediction
print(f"Language: {labels[0].replace('__label__', '')}")
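When asking for more than one candidate (k greater than 1), it can help to tidy the input and the output around model.predict. The sketch below is not part of OkayLID or fastText: clean_text and parse_predictions are hypothetical helper names, and the sample labels and scores are made up to show the shape of the data. It relies on two known behaviours: fastText's predict rejects text containing newlines, and it returns labels prefixed with __label__.

```python
def clean_text(text):
    """Collapse all whitespace (including newlines) into single spaces.

    fastText's predict() processes one line at a time and raises an error
    if the input contains a newline character.
    """
    return " ".join(text.split())


def parse_predictions(labels, probs, prefix="__label__"):
    """Pair each label (prefix removed) with its probability as a float."""
    results = []
    for label, prob in zip(labels, probs):
        if label.startswith(prefix):
            label = label[len(prefix):]
        results.append((label, float(prob)))
    return results


# Illustrative values only, shaped like the output of model.predict(text, k=2).
sample_labels = ("__label__eng_Latn", "__label__fra_Latn")
sample_probs = [0.97, 0.02]

for language, score in parse_predictions(sample_labels, sample_probs):
    print(f"{language}: {score:.2f}")
```

With a loaded model, the same helpers would be used as parse_predictions(*model.predict(clean_text(text), k=3)).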