πŸ‡³πŸ‡¬ LLaMA-3-8B Yoruba Chat – GGUF

A Yoruba-first conversational AI fine-tuned from Meta's LLaMA-3-8B Instruct using high-quality multi-turn Yoruba dialogues.

This model is optimized for Yoruba conversation, translation, and cultural context understanding, quantized to GGUF format for efficient deployment with llama.cpp, Ollama, and LM Studio.

🧠 Capabilities

  • βœ… Natural Yoruba conversation with cultural awareness
  • βœ… Yoruba ↔ English translation
  • βœ… Culturally appropriate Yoruba expressions and proverbs
  • βœ… Multi-turn dialogue with context retention
  • βœ… Lightweight GGUF format for CPU/GPU inference

πŸš€ Quick Start

llama.cpp CLI

# Prompt: "Good morning, how are you feeling today?"
./llama-cli -hf JohnsonPedia/llama-3-8b-yoruba-chat-gguf \
  -p "αΊΈ kÑàÑrọ̀, bΓ‘wo ni ara rαΊΉ αΉ£e rΓ­ lΓ³nìí?" \
  --jinja

Python (llama-cpp-python)

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="JohnsonPedia/llama-3-8b-yoruba-chat-gguf",
    filename="llama-3-8b-instruct.Q4_K_M.gguf",
)

response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "αΊΈ kÑàÑrọ̀, bΓ‘wo ni ara rαΊΉ αΉ£e rΓ­ lΓ³nìí?"}
    ]
)

print(response["choices"][0]["message"]["content"])

πŸ¦™ Ollama

A Modelfile is included for instant local deployment:

ollama create yoruba-chat -f Modelfile
ollama run yoruba-chat

Then chat:

>>> αΊΈ kÑàÑrọ̀! BΓ‘wo ni?
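The repo's bundled Modelfile is authoritative; as a rough illustration, a minimal Modelfile for this quantization would look something like the sketch below (the filename matches the Q4_K_M file listed under Available Quantizations, and the temperature follows the recommendation in the Performance section):

```
# Minimal Ollama Modelfile sketch -- the Modelfile shipped in the repo
# may set additional template and stop-token parameters.
FROM ./llama-3-8b-instruct.Q4_K_M.gguf
PARAMETER temperature 0.8
```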

πŸ—‚ Available Quantizations

| File | Size | Quantization | Use Case |
|------|------|--------------|----------|
| llama-3-8b-instruct.Q4_K_M.gguf | ~4.9 GB | Q4_K_M (4-bit) | Best quality/speed balance |

More quantizations (Q5, Q8) coming soon!
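As a sanity check on file sizes, a GGUF file is roughly parameter count times average bits per weight. The ~4.85 bits/weight figure for Q4_K_M (its blocks mix 4-bit and 6-bit quants) is an approximation, not a number from this repo:

```python
# Rough GGUF size estimate: params * average bits-per-weight / 8.
# 4.85 bits/weight for Q4_K_M is an assumed approximation.
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    return n_params * bits_per_weight / 8 / 1e9

print(round(gguf_size_gb(8.03e9, 4.85), 2))  # ~4.87, matching the ~4.9 GB above
```

The same arithmetic predicts a Q8_0 file of roughly 8.5 GB, which is why higher-bit quantizations trade disk and RAM for quality.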

πŸ’¬ Example Conversations

English to Yoruba Translation:

User: How do I say "good morning" in Yoruba?
Assistant: "Good morning" in Yoruba is "αΊΈ kÑàÑrọ̀" (formal) or "KÑàÑrọ̀" (casual).

Natural Yoruba Chat:

User: αΊΈ kÑàÑrọ̀, bΓ‘wo ni ara rαΊΉ αΉ£e rΓ­ lΓ³nìí?
Assistant: αΊΈ kÑàÑrọ̀! Mo wΓ  dΓ‘adΓ‘a, αΊΉ αΉ£eΓΊn. Ara mi αΉ£e wΓ  lÑìlΓ©wu. BΓ‘wo ni tirαΊΉ?

⚠️ Important Notes

  • Chat Format Required: This model expects properly formatted chat messages:
  {"role": "user", "content": "Your message here"}

Passing plain strings without chat formatting will cause template errors.

  • Tone Marks: For best results, use proper Yoruba diacritics (αΊΉ, ọ, αΉ£, etc.)

  • BOS Token: The BOS (Beginning of Sequence) token behavior has been modified for GGUF compatibility
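To see why plain strings fail, it helps to know what `create_chat_completion` (or `--jinja`) renders under the hood. The sketch below hand-rolls the Llama 3 chat template; the special token names follow Meta's published prompt format, but treat the exact layout as an assumption and prefer the built-in template in practice:

```python
# Sketch of the Llama 3 chat template applied automatically by
# create_chat_completion / --jinja. For illustration only.
def format_llama3_chat(messages: list[dict]) -> str:
    prompt = "<|begin_of_text|>"
    for m in messages:
        # Each turn is wrapped in header tokens and terminated with <|eot_id|>.
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Open an assistant header so the model generates the reply.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

prompt = format_llama3_chat(
    [{"role": "user", "content": "αΊΈ kÑàÑrọ̀, bΓ‘wo ni?"}]
)
```

A bare string never gets these header and end-of-turn tokens, so the template engine rejects it.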

πŸ‹οΈ Training Details

  • Base Model: meta-llama/Meta-Llama-3-8B-Instruct
  • Fine-tuning Framework: Unsloth (2Γ— faster training)
  • Dataset: Custom multi-turn Yoruba conversational corpus with cultural context
  • Conversion: GGUF format via llama.cpp for efficient CPU/GPU inference
  • Training Focus: Yoruba fluency, cultural appropriateness, translation accuracy

πŸ“Š Performance

  • Languages: Yoruba (primary), English (secondary)
  • Context Length: 8,192 tokens
  • Recommended Temperature: 0.7-0.9 for creative responses
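The temperature recommendation can be made concrete: temperature divides the logits before softmax, so higher values flatten the token distribution and let lower-probability (more "creative") tokens get sampled. A toy illustration with made-up logits:

```python
import math

# Temperature rescales logits before softmax; higher T flattens the distribution.
def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 1.0]          # toy values, not real model logits
greedy = softmax_with_temperature(logits, 0.1)    # near one-hot: top token dominates
creative = softmax_with_temperature(logits, 0.9)  # flatter: more varied sampling
```

At 0.1 the top token takes almost all the probability mass; at 0.9 alternatives stay live, which is why 0.7-0.9 suits open-ended Yoruba conversation.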

πŸ”§ Advanced Usage

Custom System Prompts

messages = [
    # System prompt: "You are an assistant knowledgeable about Yoruba culture."
    {"role": "system", "content": "Ìwọ ni olΓΉrΓ nlọ́wọ́ tΓ­ Γ³ nΓ­ Γ¬mọ̀ nΓ­pa Γ αΉ£Γ  YorΓΉbΓ‘."},
    # User: "What is the meaning of 'ọmọlΓΊΓ bΓ­'?"
    {"role": "user", "content": "KΓ­ ni Γ¬tumọ̀ 'ọmọlΓΊΓ bΓ­'?"}
]

response = llm.create_chat_completion(messages=messages)

Streaming Responses

for chunk in llm.create_chat_completion(
    messages=messages,
    stream=True
):
    # Each streamed chunk carries a partial "delta"; the first chunk holds
    # only the role, so guard on "content" before printing.
    if "content" in chunk["choices"][0]["delta"]:
        print(chunk["choices"][0]["delta"]["content"], end="", flush=True)

🀝 Contributing

Found an issue or want to improve Yoruba language support? Feel free to:

  • Report issues on the Community tab
  • Contribute training data or corrections
  • Share your use cases!

❀️ Acknowledgments

This model was developed as part of the Oduduwa AI project, dedicated to preserving and advancing African languages through AI.

Special thanks to:

  • Unsloth for accelerated training
  • llama.cpp for GGUF conversion tools
  • Meta AI for the LLaMA-3 base model
  • The Yoruba language community for cultural guidance

πŸ“œ License

This model inherits the Llama 3 Community License.


αΊΈ kÑàbọ̀ sΓ­ Oduduwa AI! (Welcome to Oduduwa AI!) πŸ‡³πŸ‡¬ Building Intelligence for African Languages
