🇹🇭 OpenThaiGPT 7b 1.0.0


🇹🇭 OpenThaiGPT 7b Version 1.0.0, released on April 8, 2024, is an advanced 7-billion-parameter Thai language chat model based on LLaMA v2. It has been fine-tuned specifically for Thai instructions, and more than 10,000 of the most commonly used Thai words have been added to the large language model's (LLM) dictionary, significantly boosting its response speed.

Highlights

  • Leading-edge Thai language LLM that sets new benchmarks by achieving the highest average score across several Thai language exams among open-source Thai LLMs.
  • The 70b variant of this release is the first 70b Thai open-source LLM and achieves a higher average score on Thai exams than OpenAI GPT 3.5, Google Gemini, and Claude 3 Haiku.
  • Supports extended conversations across multiple turns.
  • Supports Retrieval Augmented Generation (RAG) for enriched response generation.
  • Generation is ten times faster, thanks to the addition of 10,000 frequently used Thai words to the model's dictionary.
  • Pretrained on more than 65 billion Thai language words and fine-tuned with over 1 million Thai instruction examples.
  • Can process input contexts of up to 4096 Thai words, allowing for detailed and complex instructions.
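The context limit in the last highlight can be checked before a prompt is sent. The following is a minimal sketch, not part of OpenThaiGPT: it assumes the 4096 limit is counted in tokenizer tokens and that the tokenizer exposes an `encode` method returning token ids (as Hugging Face tokenizers do). `fits_context`, `max_new_tokens`, and `WhitespaceTokenizer` are hypothetical names used only for illustration.

```python
from typing import List, Protocol, Sequence


class SupportsEncode(Protocol):
    """Anything with encode(text) -> token ids, e.g. a Hugging Face tokenizer."""

    def encode(self, text: str) -> Sequence[int]: ...


def fits_context(tokenizer: SupportsEncode, prompt: str,
                 max_new_tokens: int = 512, context_window: int = 4096) -> bool:
    """Return True if the prompt plus the requested completion fits in the window."""
    prompt_tokens = len(tokenizer.encode(prompt))
    return prompt_tokens + max_new_tokens <= context_window


class WhitespaceTokenizer:
    """Stand-in for demonstration only: one token per whitespace-separated word."""

    def encode(self, text: str) -> List[int]:
        return list(range(len(text.split())))


demo = WhitespaceTokenizer()
print(fits_context(demo, "hello world"))    # 2 + 512 tokens fits in 4096
print(fits_context(demo, "word " * 4000))   # 4000 + 512 tokens does not fit
```

With the real model, pass the tokenizer loaded from the model repository instead of the stand-in.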

Benchmark by OpenThaiGPT Eval

Note: this model's results are in the OTG 7b (April 2024) column.

| Exams | OTG 7b (Aug 2023) | OTG 13b (Dec 2023) | OTG 7b (April 2024) | OTG 13b (April 2024) | OTG 70b (April 2024) | SeaLLM 7b v1 | SeaLLM 7b v2 | SeaLion 7b | WanchanGLM 7b | Sailor-7b-Chat | TyphoonGPT 7b Instruct | GPT3.5 | GPT4 | Gemini Pro | Gemini 1.5 | Claude 3 Haiku | Claude 3 Sonnet | Claude 3 Opus |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A-Level | 17.50% | 34.17% | 25.00% | 30.83% | 45.83% | 18.33% | 34.17% | 21.67% | 17.50% | 40.00% | 37.50% | 38.33% | 65.83% | 56.67% | 55.83% | 58.33% | 59.17% | 77.50% |
| TGAT | 24.00% | 22.00% | 22.00% | 36.00% | 36.00% | 14.00% | 28.00% | 24.00% | 16.00% | 34.00% | 30.00% | 28.00% | 44.00% | 22.00% | 28.00% | 36.00% | 34.00% | 46.00% |
| TPAT1 | 22.50% | 47.50% | 42.50% | 27.50% | 62.50% | 22.50% | 27.50% | 22.50% | 17.50% | 40.00% | 47.50% | 45.00% | 52.50% | 52.50% | 50.00% | 52.50% | 50.00% | 62.50% |
| thai_investment_consultant_exams | 8.00% | 28.00% | 76.00% | 84.00% | 68.00% | 16.00% | 28.00% | 24.00% | 16.00% | 24.00% | 32.00% | 40.00% | 64.00% | 52.00% | 32.00% | 44.00% | 64.00% | 72.00% |
| facebook_beleble_tha_200 | 25.00% | 45.00% | 34.50% | 39.50% | 70.00% | 13.50% | 51.00% | 27.00% | 24.50% | 63.00% | 51.50% | 50.00% | 72.50% | 65.00% | 74.00% | 63.50% | 77.00% | 90.00% |
| xcopa_th_200 | 45.00% | 56.50% | 49.50% | 51.50% | 74.50% | 26.50% | 47.00% | 51.50% | 48.50% | 68.50% | 65.00% | 64.00% | 82.00% | 68.00% | 74.00% | 64.00% | 80.00% | 86.00% |
| xnli2.0_th_200 | 33.50% | 34.50% | 39.50% | 31.00% | 47.00% | 21.00% | 43.00% | 37.50% | 33.50% | 16.00% | 20.00% | 50.00% | 69.00% | 53.00% | 54.50% | 50.00% | 68.00% | 68.50% |
| ONET M3 | 17.85% | 38.86% | 34.11% | 39.36% | 56.15% | 15.58% | 23.92% | 21.79% | 19.56% | 21.37% | 28.03% | 37.91% | 49.97% | 55.99% | 57.41% | 52.73% | 40.60% | 63.87% |
| ONET M6 | 21.14% | 28.87% | 22.53% | 23.32% | 42.85% | 15.09% | 19.48% | 16.96% | 20.67% | 28.64% | 27.46% | 34.44% | 46.29% | 45.53% | 50.23% | 34.79% | 38.49% | 48.56% |
| AVERAGE SCORE | 23.83% | 37.27% | 38.40% | 40.33% | 55.87% | 18.06% | 33.56% | 27.44% | 23.75% | 37.28% | 37.67% | 43.07% | 60.68% | 52.30% | 52.89% | 50.65% | 56.81% | 68.32% |

Thai-language multiple-choice exams, evaluated zero-shot on unseen test sets. Benchmark source code and exam information: https://github.com/OpenThaiGPT/openthaigpt_eval

(Updated on: 7 April 2024)
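The AVERAGE SCORE row appears to be the unweighted mean of the nine exam rows. The source does not state the formula, so treat this as an assumption; the short check below reproduces the published averages for two columns using scores copied from the benchmark table. `average_score` is a hypothetical helper name.

```python
# Per-exam scores (percent) copied from the benchmark table, in row order:
# A-Level, TGAT, TPAT1, thai_investment_consultant_exams, facebook_beleble_tha_200,
# xcopa_th_200, xnli2.0_th_200, ONET M3, ONET M6
scores = {
    "OTG 7b (April 2024)": [25.00, 22.00, 42.50, 76.00, 34.50, 49.50, 39.50, 34.11, 22.53],
    "OTG 70b (April 2024)": [45.83, 36.00, 62.50, 68.00, 70.00, 74.50, 47.00, 56.15, 42.85],
}


def average_score(values):
    """Unweighted mean of the per-exam scores."""
    return sum(values) / len(values)


for model, values in scores.items():
    print(f"{model}: {average_score(values):.2f}%")
# OTG 7b (April 2024): 38.40%
# OTG 70b (April 2024): 55.87%
```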

Benchmark on M3Exam evaluated by an external party (Float16.cloud)

| Models | ENGLISH (M3EXAM) | THAI (M3EXAM) |
|---|---|---|
| OTG-7b | 40.92 % | 25.14 % |
| OTG-13b | 53.69 % | 36.49 % |
| OTG-70b | 72.58 % | 48.29 % |
| GPT-3.5-turbo-0613* | - | 34.1 % |
| GPT-4-0613* | - | 56.0 % |

More information: https://blog.float16.cloud/the-first-70b-thai-llm/

Licenses

Source Code: Apache Software License 2.0.
Weights: Research and commercial use.

Prompt Format

The prompt format follows Llama 2 with one small modification: "###" separates the user's message from the optional context.

<s>[INST] <<SYS>>
{system_prompt}
<</SYS>>

{human_turn1}###{context_turn1} [/INST]{assistant_turn1}</s><s>{human_turn2}###{context_turn2} [/INST] ...

System prompt:

You are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด
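The template and system prompt above can be assembled programmatically. This is a minimal sketch rather than an official helper: `build_prompt` and the `(human, context, assistant)` tuple layout are hypothetical, and the sketch assumes that "###" is omitted when a turn has no context, as in the examples that follow.

```python
from typing import List, Optional, Tuple

SYSTEM_PROMPT = (
    "You are a question answering assistant. Answer the question as truthful "
    "and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด"
)

# (human message, optional context, assistant reply or None for the pending turn)
Turn = Tuple[str, Optional[str], Optional[str]]


def build_prompt(turns: List[Turn], system_prompt: str = SYSTEM_PROMPT) -> str:
    """Assemble a Llama 2 style prompt with "###" separating message and context."""
    if not turns:
        raise ValueError("at least one turn is required")
    prompt = f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
    for index, (human, context, assistant) in enumerate(turns):
        if index > 0:
            prompt += "<s>"  # later turns start a new sequence without [INST]
        prompt += human
        if context:
            prompt += f"###{context}"
        prompt += " [/INST]"
        if assistant is not None:
            prompt += f"{assistant}</s>"
    return prompt


# Single turn, no context
print(build_prompt([("สวัสดีครับ", None, None)]))
```

For a multi-turn conversation, pass every earlier turn with its assistant reply and leave the reply of the final turn as `None`.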

Examples

Single Turn Conversation Example

<s>[INST] <<SYS>>
You are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด
<</SYS>>

สวัสดีครับ [/INST]

Single Turn Conversation with Context (RAG) Example

<s>[INST] <<SYS>>
You are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด
<</SYS>>

กรุงเทพมีพื้นที่เท่าไร่###กรุงเทพมหานคร เป็นเมืองหลวง นครและมหานครที่มีประชากรมากที่สุดของประเทศไทย กรุงเทพมหานครมีพื้นที่ทั้งหมด 1,568.737 ตร.กม. มีประชากรตามทะเบียนราษฎรกว่า 8 ล้านคน [/INST]

Multi Turn Conversation Example

First turn
<s>[INST] <<SYS>>
You are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด
<</SYS>>

สวัสดีครับ [/INST]
Second turn
<s>[INST] <<SYS>>
You are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด
<</SYS>>

สวัสดีครับ [/INST]สวัสดีค่ะ มีคำถามอะไร ถามได้เลย</s><s>ขอสูตรทำส้มตำหน่อย [/INST]
Third turn
<s>[INST] <<SYS>>
You are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด
<</SYS>>

สวัสดีครับ [/INST]สวัสดีค่ะ มีคำถามอะไร ถามได้เลย</s><s>ขอสูตรทำส้มตำหน่อย [/INST]ได้เลยค่ะ ส้มตำเป็นเมนูที่ทำง่ายและอร่อย มาเริ่มกันเลยนะคะ</s><s>เริ่มได้เลย [/INST]
Fourth turn
<s>[INST] <<SYS>>
You are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด
<</SYS>>

สวัสดีครับ [/INST]สวัสดีค่ะ มีคำถามอะไร ถามได้เลย</s><s>ขอสูตรทำส้มตำหน่อย [/INST]ได้เลยค่ะ ส้มตำเป็นเมนูที่ทำง่ายและอร่อย มาเริ่มกันเลยนะคะ</s><s>เริ่มได้เลย [/INST]
ตำไทยเป็นเมนูที่ใครๆ ก็คุ้นเคย ด้วยรสชาติที่กลมกล่อม หวาน เปรี้ยว เค็ม เผ็ด ครบรส วันนี้เราจะมาทำส้มตำด้วยกันค่ะ สิ่งแรกที่ต้องมีคือเครื่องปรุงดังนี้
- มะละกอดิบ 1 ลูก
- กุ้งแห้ง 1/2 ถ้วย
- กระเทียม 3 กลีบ
- พริกขี้หนู 3 เม็ด
- น้ำตาลปี๊บ 1 ช้อนโต๊ะ
- น้ำปลา 2 ช้อนโต๊ะ
- มะนาว 1 ลูก
- งาขาวคั่ว 1/4 ถ้วย

วิธีทำมีดังนี้ค่ะ
1. เริ่มจากล้างมะละกอให้สะอาด แล้วใช้มีดปอกเปลือก เอาไส้ออก หั่นเป็นเส้นบางๆ เตรียมไว้
2. ตำกุ้งแห้งให้ละเอียด แล้วตักขึ้นพักไว้
3. ใช้ครกหินหรือเครื่องปั่น บดพริกขี้หนูกับกระเทียมให้ละเอียด
4. ใส่กุ้งแห้งที่ตำแล้วลงไปผสม ตามด้วยน้ำตาลปี๊บ น้ำปลา มะนาว และเส้นมะละกอ คลุกเคล้าให้เข้ากัน
5. ชิมรสให้ได้รสหวาน เปรี้ยว เค็ม เผ็ด จากนั้นตักขึ้นเสิร์ฟพร้อมผักสด อาทิ กะหล่ำปลี ถั่วงอก แครอท ผักบุ้ง</s><s>ขอบคุณครับ [/INST]

Multi Turn Conversation with Context (RAG) Example

<s>[INST] <<SYS>>
You are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด
<</SYS>>

กรุงเทพมีพื้นที่เท่าไร่###กรุงเทพมหานคร เป็นเมืองหลวง นครและมหานครที่มีประชากรมากที่สุดของประเทศไทย กรุงเทพมหานครมีพื้นที่ทั้งหมด 1,568.737 ตร.กม. มีประชากรตามทะเบียนราษฎรกว่า 8 ล้านคน [/INST]
กรุงเทพมหานครมีพื้นที่ทั้งหมด 1,568.737 ตร.กม.</s><s>และประชากรล่ะ [/INST]

How to use

Huggingface

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Use the GPU when CUDA is available, otherwise fall back to the CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Using device: {device}")

# Init Model
model_path = "openthaigpt/openthaigpt-1.0.0-7b-chat"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True, torch_dtype=torch.float16)
model.to(device)

# Prompt (system prompt on its own line, as in the prompt format above)
prompt = "สวัสดีครับ OpenThaiGPT"
llama_prompt = f"<s>[INST] <<SYS>>\nYou are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด\n<</SYS>>\n\n{prompt} [/INST]"
inputs = tokenizer.encode(llama_prompt, return_tensors="pt")
inputs = inputs.to(device)

# Generate (max_length counts the prompt tokens plus the generated tokens)
outputs = model.generate(inputs, max_length=512, num_return_sequences=1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
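Because `generate` returns the input ids followed by the new tokens, the decoded string contains the prompt as well as the answer. A minimal sketch for keeping only the answer is shown below; `extract_reply` is a hypothetical helper, and it assumes the reply is everything after the last "[/INST]" marker, as in the prompt format described earlier.

```python
def extract_reply(decoded: str, marker: str = "[/INST]") -> str:
    """Return the text after the last marker, stripped of surrounding whitespace."""
    _, found, tail = decoded.rpartition(marker)
    # rpartition leaves the whole string in `tail` when the marker is absent
    return tail.strip() if found else decoded.strip()


decoded = "[INST] <<SYS>>\nsystem prompt\n<</SYS>>\n\nสวัสดีครับ [/INST] สวัสดีค่ะ"
print(extract_reply(decoded))
```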

vLLM

  1. Install vLLM (https://github.com/vllm-project/vllm)

  2. Run server

python -m vllm.entrypoints.api_server --model /path/to/model --tensor-parallel-size num_gpus
  3. Run inference (CURL example)
curl --request POST \
    --url http://localhost:8000/generate \
    --header "Content-Type: application/json" \
    --data '{"prompt": "<s>[INST] <<SYS>>\nYou are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด\n<</SYS>>\n\nอยากลดความอ้วนต้องทำอย่างไร [/INST]","use_beam_search": false, "temperature": 0.1, "max_tokens": 512, "top_p": 0.75, "top_k": 40, "frequency_penalty": 0.3, "stop": "</s>"}'
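The same request can be sent from Python with only the standard library. This is a sketch under the assumption that the server from step 2 is listening on http://localhost:8000; `build_payload` and `generate` are hypothetical helper names, and the sampling parameters are copied from the CURL example.

```python
import json
import urllib.request


def build_payload(prompt: str) -> dict:
    """Request body with the same sampling parameters as the CURL example."""
    return {
        "prompt": prompt,
        "use_beam_search": False,
        "temperature": 0.1,
        "max_tokens": 512,
        "top_p": 0.75,
        "top_k": 40,
        "frequency_penalty": 0.3,
        "stop": "</s>",
    }


def generate(prompt: str, url: str = "http://localhost:8000/generate") -> dict:
    """POST the prompt to the server and return the parsed JSON response."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Call `generate(llama_prompt)` with a fully formatted prompt string while the server is running. The shape of the returned JSON depends on the vLLM version, so inspect it before indexing into it.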

LlamaCPP (for GGUF)

  1. Build and install llama.cpp (LLAMA_CUBLAS=1 enables GPU inference)
git clone https://github.com/ggerganov/llama.cpp.git \
  && cd llama.cpp \
  && make -j LLAMA_CUBLAS=1 CUDA_DOCKER_ARCH=all
  2. Run server (--port 8000 matches the CURL example below; the llama.cpp server otherwise listens on port 8080)
./server -m /path/to/ggml-model-f16.gguf -c 3072 -ngl 81 -ts 1,1 --host 0.0.0.0 --port 8000
  3. Run inference (CURL example)
curl --location 'http://localhost:8000/completion' \
--header 'Content-Type: application/json' \
--data '{
    "prompt":"<s>[INST] <<SYS>>\nYou are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด\n<</SYS>>\n\nอยากลดความอ้วนต้องทำอย่างไร [/INST]",
    "max_tokens": 512,
    "stop":"</s>"
}'

Ollama

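There are two ways to run the model on Ollama:

  1. From this repo's Modelfile and 4-bit quantized GGUF (openthaigpt here is an arbitrary local model name)
ollama create openthaigpt -f ./Modelfile
ollama run openthaigpt
  2. From the Ollama CLI
ollama run pacozaa/openthaigpt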

GPU Memory Requirements

| Number of Parameters | FP 16 bits | 8 bits (Quantized) | 4 bits (Quantized) | Example Graphics Card for 4 bits |
|---|---|---|---|---|
| 7b | 24 GB | 12 GB | 6 GB | Nvidia RTX 4060 8GB |
| 13b | 48 GB | 24 GB | 12 GB | Nvidia RTX 4070 16GB |
| 70b | 192 GB | 96 GB | 48 GB | Nvidia RTX 4090 24GB x 2 cards |
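For comparison, the memory needed by the weights alone is the parameter count times the bits per weight, divided by 8. The sketch below computes that lower bound; `weight_memory_gb` is a hypothetical helper. The table's figures are higher than this bound, presumably because they leave room for activations, the KV cache, and runtime overhead, but the source does not say how they were derived, so that explanation is an assumption.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


for size in (7, 13, 70):
    row = ", ".join(
        f"{bits}-bit: {weight_memory_gb(size, bits):.1f} GB" for bits in (16, 8, 4)
    )
    print(f"{size}b -> {row}")
# 7b -> 16-bit: 14.0 GB, 8-bit: 7.0 GB, 4-bit: 3.5 GB
# 13b -> 16-bit: 26.0 GB, 8-bit: 13.0 GB, 4-bit: 6.5 GB
# 70b -> 16-bit: 140.0 GB, 8-bit: 70.0 GB, 4-bit: 35.0 GB
```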

Citation

If OpenThaiGPT has been beneficial for your work, kindly consider citing it as follows:

BibTeX

@misc{yuenyong2024openthaigpt15thaicentricopen,
      title={OpenThaiGPT 1.5: A Thai-Centric Open Source Large Language Model}, 
      author={Sumeth Yuenyong and Kobkrit Viriyayudhakorn and Apivadee Piyatumrong and Jillaphat Jaroenkantasima},
      year={2024},
      eprint={2411.07238},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.07238}, 
}

APA Style (for TXT, MS Word)

Yuenyong, S., Viriyayudhakorn, K., Piyatumrong, A., & Jaroenkantasima, J. (2024). OpenThaiGPT 1.5: A Thai-Centric Open Source Large Language Model. arXiv [cs.CL]. Retrieved from https://arxiv.org/abs/2411.07238

Disclaimer: Provided responses are not guaranteed.
