πŸ‡³πŸ‡¬ LLaMA-3-8B Yoruba Chat – GGUF

A Yoruba-first conversational AI fine-tuned from Meta's LLaMA-3-8B Instruct using high-quality multi-turn Yoruba dialogues.

This model is optimized for Yoruba conversation, translation, and cultural context understanding, quantized to GGUF format for efficient deployment with llama.cpp, Ollama, and LM Studio.

🧠 Capabilities

  • βœ… Natural Yoruba conversation with cultural awareness
  • βœ… Yoruba ↔ English translation
  • βœ… Culturally appropriate Yoruba expressions and proverbs
  • βœ… Multi-turn dialogue with context retention
  • βœ… Lightweight GGUF format for CPU/GPU inference

πŸš€ Quick Start

llama.cpp CLI

# Prompt: "Good morning, how are you feeling today?"
./llama-cli -hf JohnsonPedia/llama-3-8b-yoruba-chat-gguf \
  -p "αΊΈ kÑàÑrọ̀, bΓ‘wo ni ara rαΊΉ αΉ£e rΓ­ lΓ³nìí?" \
  --jinja

Python (llama-cpp-python)

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="JohnsonPedia/llama-3-8b-yoruba-chat-gguf",
    filename="llama-3-8b-instruct.Q4_K_M.gguf",
)

response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "αΊΈ kÑàÑrọ̀, bΓ‘wo ni ara rαΊΉ αΉ£e rΓ­ lΓ³nìí?"}
    ]
)

print(response["choices"][0]["message"]["content"])

πŸ¦™ Ollama

A Modelfile is included for instant local deployment:

ollama create yoruba-chat -f Modelfile
ollama run yoruba-chat

Then chat:

>>> αΊΈ kÑàÑrọ̀! BΓ‘wo ni?
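The repo's bundled Modelfile is authoritative; as a rough illustration, a minimal Modelfile for this quantization would look something like the sketch below (the filename matches the Q4_K_M file listed under Available Quantizations, and the temperature follows the recommendation in the Performance section):

```
# Minimal Ollama Modelfile sketch -- the Modelfile shipped in the repo
# may set additional template and stop-token parameters.
FROM ./llama-3-8b-instruct.Q4_K_M.gguf
PARAMETER temperature 0.8
```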

πŸ—‚ Available Quantizations

| File | Size | Quantization | Use Case |
|------|------|--------------|----------|
| llama-3-8b-instruct.Q4_K_M.gguf | ~4.9 GB | Q4_K_M (4-bit) | Best quality/speed balance |

More quantizations (Q5, Q8) coming soon!
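As a sanity check on file sizes, a GGUF file is roughly parameter count times average bits per weight. The ~4.85 bits/weight figure for Q4_K_M (its blocks mix 4-bit and 6-bit quants) is an approximation, not a number from this repo:

```python
# Rough GGUF size estimate: params * average bits-per-weight / 8.
# 4.85 bits/weight for Q4_K_M is an assumed approximation.
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    return n_params * bits_per_weight / 8 / 1e9

print(round(gguf_size_gb(8.03e9, 4.85), 2))  # ~4.87, matching the ~4.9 GB above
```

The same arithmetic predicts a Q8_0 file of roughly 8.5 GB, which is why higher-bit quantizations trade disk and RAM for quality.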

πŸ’¬ Example Conversations

English to Yoruba Translation:

User: How do I say "good morning" in Yoruba?
Assistant: "Good morning" in Yoruba is "αΊΈ kÑàÑrọ̀" (formal) or "KÑàÑrọ̀" (casual).

Natural Yoruba Chat:

User: αΊΈ kÑàÑrọ̀, bΓ‘wo ni ara rαΊΉ αΉ£e rΓ­ lΓ³nìí?
Assistant: αΊΈ kÑàÑrọ̀! Mo wΓ  dΓ‘adΓ‘a, αΊΉ αΉ£eΓΊn. Ara mi αΉ£e wΓ  lÑìlΓ©wu. BΓ‘wo ni tirαΊΉ?

⚠️ Important Notes

  • Chat Format Required: This model expects properly formatted chat messages:
  {"role": "user", "content": "Your message here"}

Passing plain strings without chat formatting will cause template errors.

  • Tone Marks: For best results, use proper Yoruba diacritics (αΊΉ, ọ, αΉ£, etc.)

  • BOS Token: The BOS (Beginning of Sequence) token behavior has been modified for GGUF compatibility
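To see why plain strings fail, it helps to know what `create_chat_completion` (or `--jinja`) renders under the hood. The sketch below hand-rolls the Llama 3 chat template; the special token names follow Meta's published prompt format, but treat the exact layout as an assumption and prefer the built-in template in practice:

```python
# Sketch of the Llama 3 chat template applied automatically by
# create_chat_completion / --jinja. For illustration only.
def format_llama3_chat(messages: list[dict]) -> str:
    prompt = "<|begin_of_text|>"
    for m in messages:
        # Each turn is wrapped in header tokens and terminated with <|eot_id|>.
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Open an assistant header so the model generates the reply.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

prompt = format_llama3_chat(
    [{"role": "user", "content": "αΊΈ kÑàÑrọ̀, bΓ‘wo ni?"}]
)
```

A bare string never gets these header and end-of-turn tokens, so the template engine rejects it.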

πŸ‹οΈ Training Details

  • Base Model: meta-llama/Meta-Llama-3-8B-Instruct
  • Fine-tuning Framework: Unsloth (2Γ— faster training)
  • Dataset: Custom multi-turn Yoruba conversational corpus with cultural context
  • Conversion: GGUF format via llama.cpp for efficient CPU/GPU inference
  • Training Focus: Yoruba fluency, cultural appropriateness, translation accuracy

πŸ“Š Performance

  • Languages: Yoruba (primary), English (secondary)
  • Context Length: 8,192 tokens
  • Recommended Temperature: 0.7-0.9 for creative responses
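The temperature recommendation can be made concrete: temperature divides the logits before softmax, so higher values flatten the token distribution and let lower-probability (more "creative") tokens get sampled. A toy illustration with made-up logits:

```python
import math

# Temperature rescales logits before softmax; higher T flattens the distribution.
def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 1.0]          # toy values, not real model logits
greedy = softmax_with_temperature(logits, 0.1)    # near one-hot: top token dominates
creative = softmax_with_temperature(logits, 0.9)  # flatter: more varied sampling
```

At 0.1 the top token takes almost all the probability mass; at 0.9 alternatives stay live, which is why 0.7-0.9 suits open-ended Yoruba conversation.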

πŸ”§ Advanced Usage

Custom System Prompts

messages = [
    # System prompt: "You are an assistant knowledgeable about Yoruba culture."
    {"role": "system", "content": "Ìwọ ni olΓΉrΓ nlọ́wọ́ tΓ­ Γ³ nΓ­ Γ¬mọ̀ nΓ­pa Γ αΉ£Γ  YorΓΉbΓ‘."},
    # User: "What is the meaning of 'ọmọlΓΊΓ bΓ­'?"
    {"role": "user", "content": "KΓ­ ni Γ¬tumọ̀ 'ọmọlΓΊΓ bΓ­'?"}
]

response = llm.create_chat_completion(messages=messages)

Streaming Responses

for chunk in llm.create_chat_completion(
    messages=messages,
    stream=True
):
    # Each streamed chunk carries a partial "delta"; the first chunk holds
    # only the role, so guard on "content" before printing.
    if "content" in chunk["choices"][0]["delta"]:
        print(chunk["choices"][0]["delta"]["content"], end="", flush=True)

🀝 Contributing

Found an issue or want to improve Yoruba language support? Feel free to:

  • Report issues on the Community tab
  • Contribute training data or corrections
  • Share your use cases!

❀️ Acknowledgments

This model was developed as part of the Oduduwa AI project, dedicated to preserving and advancing African languages through AI.

Special thanks to:

  • Unsloth for accelerated training
  • llama.cpp for GGUF conversion tools
  • Meta AI for the LLaMA-3 base model
  • The Yoruba language community for cultural guidance

πŸ“œ License

This model inherits the Llama 3 Community License.


αΊΈ kÑàbọ̀ sΓ­ Oduduwa AI! (Welcome to Oduduwa AI!) πŸ‡³πŸ‡¬ Building Intelligence for African Languages
