# 🇳🇬 LLaMA-3-8B Yoruba Chat (GGUF)
A Yoruba-first conversational AI fine-tuned from Meta's LLaMA-3-8B Instruct using high-quality multi-turn Yoruba dialogues.
This model is optimized for Yoruba conversation, translation, and cultural context understanding, quantized to GGUF format for efficient deployment with llama.cpp, Ollama, and LM Studio.
## 🧠 Capabilities

- ✅ Natural Yoruba conversation with cultural awareness
- ✅ Yoruba ↔ English translation
- ✅ Culturally appropriate Yoruba expressions and proverbs
- ✅ Multi-turn dialogue with context retention
- ✅ Lightweight GGUF format for CPU/GPU inference
## 🚀 Quick Start

### llama.cpp CLI

```bash
./llama-cli -hf JohnsonPedia/llama-3-8b-yoruba-chat-gguf \
  -p "Ẹ káàárọ̀, báwo ni ara rẹ ṣe rí lónìí?" \
  --jinja
```
### Python (llama-cpp-python)

```python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="JohnsonPedia/llama-3-8b-yoruba-chat-gguf",
    filename="llama-3-8b-instruct.Q4_K_M.gguf",
)

response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Ẹ káàárọ̀, báwo ni ara rẹ ṣe rí lónìí?"}
    ]
)
print(response["choices"][0]["message"]["content"])
```
## 📦 Ollama

A Modelfile is included for instant local deployment:

```bash
ollama create yoruba-chat -f Modelfile
ollama run yoruba-chat
```

Then chat:

```
>>> Ẹ káàárọ̀! Báwo ni?
```
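If you want to adapt the bundled Modelfile (for example to tweak sampling), a minimal sketch looks like the following. The filename and parameter values here are illustrative assumptions; the Modelfile shipped in this repo is authoritative, and depending on your Ollama version you may also need a `TEMPLATE` block matching the Llama 3 chat format.

```
FROM ./llama-3-8b-instruct.Q4_K_M.gguf
PARAMETER temperature 0.8
PARAMETER num_ctx 8192
```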
## 📊 Available Quantizations

| File | Size | Description | Use Case |
|---|---|---|---|
| `llama-3-8b-instruct.Q4_K_M.gguf` | ~4.9 GB | 4-bit quantized | Best quality/speed balance |

More quantizations (Q5, Q8) coming soon!
## 💬 Example Conversations

**English to Yoruba Translation:**

> **User:** How do I say "good morning" in Yoruba?
> **Assistant:** "Good morning" in Yoruba is "Ẹ káàárọ̀" (formal) or "Káàárọ̀" (casual).

**Natural Yoruba Chat:**

> **User:** Ẹ káàárọ̀, báwo ni ara rẹ ṣe rí lónìí?
> **Assistant:** Ẹ káàárọ̀! Mo wà dáadáa, ẹ ṣeún. Ara mi ṣe wà láìléwu. Báwo ni tirẹ?
## ⚠️ Important Notes

- **Chat Format Required:** This model expects properly formatted chat messages:
  ```python
  {"role": "user", "content": "Your message here"}
  ```
  Passing plain strings without chat formatting will cause template errors.
- **Tone Marks:** For best results, use proper Yoruba diacritics (ẹ, ọ, ṣ, etc.).
- **BOS Token:** The BOS (Beginning of Sequence) token behavior has been modified for GGUF compatibility.
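The tone-mark note above matters at the byte level too: the same Yoruba word can be typed with precomposed characters or with combining diacritics, and mixed forms tokenize differently. A small sketch (plain Python standard library; the helper name is my own, not part of this repo) that normalizes input to NFC before sending it to the model:

```python
import unicodedata

def normalize_yoruba(text: str) -> str:
    """Normalize text to NFC so Yoruba diacritics use precomposed
    code points where they exist (e.g. U+1EB9 for ẹ)."""
    return unicodedata.normalize("NFC", text)

# "ẹ" typed as "e" + combining dot below (U+0323) collapses to one
# precomposed character; "a" + combining acute becomes "á", etc.
decomposed = "e\u0323 ka\u0301a\u0300ro\u0323\u0300"
print(normalize_yoruba(decomposed))
```

Running every user message through NFC normalization keeps prompts consistent with however the training data was encoded.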
## 🏋️ Training Details

- **Base Model:** meta-llama/Meta-Llama-3-8B-Instruct
- **Fine-tuning Framework:** Unsloth (2× faster training)
- **Dataset:** Custom multi-turn Yoruba conversational corpus with cultural context
- **Conversion:** GGUF format via llama.cpp for efficient CPU/GPU inference
- **Training Focus:** Yoruba fluency, cultural appropriateness, translation accuracy
## 📈 Performance

- **Languages:** Yoruba (primary), English (secondary)
- **Context Length:** 8,192 tokens
- **Recommended Temperature:** 0.7-0.9 for creative responses
## 🔧 Advanced Usage

### Custom System Prompts

```python
messages = [
    {"role": "system", "content": "Ìwọ ni olùrànlọ́wọ́ tí ó ní ìmọ̀ nípa àṣà Yorùbá."},
    {"role": "user", "content": "Kí ni ìtumọ̀ 'ọmọlúàbí'?"}
]
```
### Streaming Responses

```python
for chunk in llm.create_chat_completion(
    messages=messages,
    stream=True,
):
    # The first chunk carries only the role; later chunks carry content deltas.
    delta = chunk["choices"][0]["delta"]
    if "content" in delta:
        print(delta["content"], end="", flush=True)
```
## 🤝 Contributing

Found an issue or want to improve Yoruba language support? Feel free to:

- Report issues on the Community tab
- Contribute training data or corrections
- Share your use cases!
## ❤️ Acknowledgments

This model was developed as part of the Oduduwa AI project, dedicated to preserving and advancing African languages through AI.

Special thanks to:

- Unsloth for accelerated training
- llama.cpp for GGUF conversion tools
- Meta AI for the LLaMA-3 base model
- The Yoruba language community for cultural guidance
## 📜 License

This model inherits the Llama 3 Community License.

**Ẹ káàbọ̀ sí Oduduwa AI!** 🇳🇬 *Building Intelligence for African Languages*