This is an expanded version of the dleemiller/WordLlamaDetect model: two WordLlama-based LID models are stacked to improve performance.
```
Training data (740k samples)
              │
              ▼
┌─────────────────────────────────────┐
│         Phase 1: Base Models        │
│                                     │
│  ┌──────────────┐  ┌──────────────┐ │
│  │ LID Model 01 │  │ LID Model 02 │ │
│  └───────┬──────┘  └───────┬──────┘ │
└──────────┼─────────────────┼────────┘
           │    train each   │
           │  independently  │
           ▼                 ▼
    lid_models[0]     lid_models[1]
           │                 │
           └────────┬────────┘
                    │
                    ▼
  collect_preds() → X: (N, 2*148) = (N, 296)

   model1 logits        model2 logits
     (N, 148)    cat      (N, 148)
           └────────┬────────┘
                    ▼
                (N, 296)
                    │
        Linear(296 → 148)   → 296*148 = 43,808 params trained
                    │
                    ▼
        (N, 148) → CrossEntropy(y)
```
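To make the stacking step concrete, here is a minimal PyTorch sketch of the Phase 2 meta-classifier described in the diagram. It assumes the two frozen base models already produce per-language logits of shape (N, 148); the class name `StackHead` and the tensor names are illustrative, not the repository's actual implementation.

```python
import torch
import torch.nn as nn

class StackHead(nn.Module):
    """Illustrative meta-classifier: concatenates the logits of the two
    base LID models and maps them back to the 148 language classes."""

    def __init__(self, num_langs: int = 148, num_base_models: int = 2):
        super().__init__()
        # 296 -> 148 linear layer; 296 * 148 = 43,808 weight parameters, as in the diagram
        self.linear = nn.Linear(num_langs * num_base_models, num_langs)

    def forward(self, logits_m1: torch.Tensor, logits_m2: torch.Tensor) -> torch.Tensor:
        x = torch.cat([logits_m1, logits_m2], dim=-1)  # (N, 296)
        return self.linear(x)                          # (N, 148)

# Training sketch: only the stacking head is optimized; the base models stay frozen.
head = StackHead()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

# Dummy stand-ins for collect_preds() output and the gold labels.
logits_m1, logits_m2 = torch.randn(32, 148), torch.randn(32, 148)
labels = torch.randint(0, 148, (32,))

optimizer.zero_grad()
loss = criterion(head(logits_m1, logits_m2), labels)
loss.backward()
optimizer.step()
```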
| Pair | Num Languages | Accuracy | F1 Macro | Metrics per Base Model |
|---|---|---|---|---|
| gemma3_27b + gemma_300m | 148 | 0.9307 | 0.9303 | gemma3_27b: Acc 0.9147, F1 0.9149<br>gemma_300m: Acc 0.9087, F1 0.9078 |
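For reference, the Accuracy and F1 Macro columns correspond to the standard scikit-learn metrics; the snippet below is a generic sketch over toy labels, not part of this repository's evaluation code.

```python
from sklearn.metrics import accuracy_score, f1_score

# y_true / y_pred are language codes such as "eng_Latn", "vie_Latn", ...
y_true = ["eng_Latn", "fra_Latn", "jpn_Jpan", "vie_Latn"]
y_pred = ["eng_Latn", "fra_Latn", "jpn_Jpan", "eng_Latn"]

print("Accuracy:", accuracy_score(y_true, y_pred))
# Macro F1 averages per-language F1 equally across all classes,
# so rare languages count as much as frequent ones.
print("F1 Macro:", f1_score(y_true, y_pred, average="macro"))
```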
```python
import sys
from huggingface_hub import snapshot_download

# Download all files from the model repository
local_dir = snapshot_download(repo_id="Bonkh/lid-stack-model-gemma_3_27b-gemma_3_300m")

# Make model.py importable
sys.path.insert(0, local_dir)
from model import LIDStack

# Load the stacked model
model = LIDStack.from_pretrained("Bonkh/lid-stack-model-gemma_3_27b-gemma_3_300m")

# Inference
print(model.predict("Hello, how are you?"))          # → "eng_Latn"
print(model.predict(["Bonjour", "こんにちは"]))        # → ["fra_Latn", "jpn_Jpan"]
print(model.predict("Xin chào", return_probs=True))  # → [("vie_Latn", 0.97)]
```
Base model: [dleemiller/WordLlamaDetect](https://huggingface.co/dleemiller/WordLlamaDetect)