๐Ÿฐ ShweYon-V3-Base (แ€›แ€ฝแ€พแ€ฑแ€šแ€ฏแ€”แ€บ-V3)

ShweYon-V3-Base is a Myanmar-centric base language model built on top of the Qwen 2.5 1.5B architecture. Unlike previous versions, no separate tokenizer files are needed: the Myanmar tokens are integrated directly into the model's embedding layer. This model is a milestone in the "ShweYon" project, focusing on improving the efficiency of Myanmar script processing through a custom tokenizer.

๐ŸŽฏ Purpose

This model is intended as a foundation base model for the Myanmar language. It is best used as a starting point for further fine-tuning (SFT/RLHF) toward chatbots, question-answering systems, and other downstream NLP tasks.

โœจ Technical Highlights

  • Integrated Tokenizer: A custom tokenizer covering more than 9,000 Myanmar syllables and words is bundled directly with the model, so no separate tokenizer setup is required.
  • Extended Vocabulary: The vocabulary has been extended to 160,746 entries, allowing Myanmar text to be processed more compactly and quickly.
  • Base Training: The model was pre-trained on Myanmar literature and books to strengthen its general knowledge of written Myanmar.

๐Ÿš€ Quick Start

This base model can be downloaded and used as shown below.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "URajinda/ShweYon-V3-Base"

# The correct tokenizer is bundled with the model and loads automatically
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Try a sample generation
prompt = "แ€™แ€ผแ€”แ€บแ€™แ€ฌแ€”แ€ญแ€ฏแ€„แ€บแ€„แ€ถแ แ€žแ€™แ€ญแ€ฏแ€„แ€บแ€ธแ€€แ€ผแ€ฑแ€ฌแ€„แ€บแ€ธแ€™แ€พแ€ฌ"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```

โš ๏ธ Note
This is a base model only; additional chat fine-tuning (instruction following) is required before it can hold conversations with users.
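
As a rough illustration of what preparing data for such chat fine-tuning might look like, the sketch below formats instruction–response pairs into training strings. The "### Instruction / ### Response" template and the example pair are illustrative assumptions, not a format defined by the ShweYon project:

```python
# Hypothetical SFT data formatting. The template below is an assumption
# for illustration only; it is not prescribed by ShweYon-V3-Base.
def format_example(instruction: str, response: str) -> str:
    return f"### Instruction:\n{instruction}\n\n### Response:\n{response}"

pairs = [
    # "What is the capital of Myanmar?" -> "It is Naypyidaw."
    ("แ€™แ€ผแ€”แ€บแ€™แ€ฌแ€”แ€ญแ€ฏแ€„แ€บแ€„แ€ถแ แ€™แ€ผแ€ญแ€ฏแ€ทแ€แ€ฑแ€ฌแ€บแ€€ แ€˜แ€šแ€บแ€™แ€ผแ€ญแ€ฏแ€ทแ€œแ€ฒแ‹", "แ€”แ€ฑแ€•แ€ผแ€Šแ€บแ€แ€ฑแ€ฌแ€บ แ€–แ€ผแ€…แ€บแ€•แ€ซแ€žแ€Šแ€บแ‹"),
]
dataset = [format_example(q, a) for q, a in pairs]
print(dataset[0])
```

Strings formatted this way could then be fed to a standard supervised fine-tuning loop on top of this base model.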

โš–๏ธ License
Apache License 2.0
Model size: 2B parameters (Safetensors, BF16)
Downloads last month: 88