Qwen2.5-0.5B Singlish → Sinhala Transliterator
This model transliterates Singlish (romanized Sinhala) to Sinhala Unicode script.
Training
- Base model: Qwen/Qwen2.5-0.5B
- Three-phase LoRA fine-tuning on ~1M phonetic pairs + ~12K ad hoc pairs
- LoRA rank: 64, alpha: 128
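For orientation, the rank and alpha above imply a LoRA scaling factor of alpha / rank = 2. The sketch below works that out and counts the extra trainable parameters for a single adapted weight matrix. The 896 × 896 matrix shape is an assumption for illustration only (this card does not say which modules were adapted), and the helper names are hypothetical.

```python
def lora_scaling(rank: int, alpha: int) -> float:
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted matrix.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


RANK, ALPHA = 64, 128   # values from the Training section
D_IN = D_OUT = 896      # assumed square projection; not stated in this card

print(lora_scaling(RANK, ALPHA))            # 2.0
print(lora_param_count(D_IN, D_OUT, RANK))  # 114688
```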
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Pudamya/Qwen2.5-Singlish-Sinhala")
tokenizer = AutoTokenizer.from_pretrained("Pudamya/Qwen2.5-Singlish-Sinhala")

def transliterate(text):
    messages = [
        {"role": "system", "content": "You are a Sinhala transliteration expert. Convert Singlish (romanized Sinhala) to Sinhala Unicode script accurately."},
        {"role": "user", "content": "Transliterate the following Singlish text to Sinhala:\n" + text},
    ]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    enc = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**enc, max_new_tokens=80, do_sample=False, num_beams=2)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(out[0][enc["input_ids"].shape[1]:], skip_special_tokens=True).strip()

print(transliterate("mama kohomada"))  # → මම කොහොමද
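Because `generate` is called with `max_new_tokens=80`, longer inputs may come back truncated. One possible workaround, sketched below, is to split the input into short word groups and transliterate each group separately. `chunk_words`, `transliterate_long`, and the chunk size of 8 words are hypothetical choices, not part of the model's API; `fn` stands for any callable that maps a string to a string, such as the `transliterate` function above.

```python
from typing import Callable, List


def chunk_words(text: str, max_words: int = 8) -> List[str]:
    """Split text on whitespace into groups of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def transliterate_long(text: str, fn: Callable[[str], str], max_words: int = 8) -> str:
    """Apply fn to each chunk and join the results with single spaces."""
    return " ".join(fn(chunk) for chunk in chunk_words(text, max_words))


# Chunking demonstrated on placeholder words; no model is needed for this step.
print(chunk_words("a b c d e", max_words=2))  # ['a b', 'c d', 'e']
```

In real use, pass the model-backed function: `transliterate_long(long_text, transliterate)`. Each chunk is processed without the context of its neighbours, so this trades some accuracy at chunk boundaries for complete output.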