Large Language Models
Multilingual & Low-Resource NLP
Language Model Training & Fine-Tuning
Speech & Audio Models
Dataset Engineering
Efficient AI Systems
Nanochat Moroccan is the first language model family built specifically for Moroccan Darija.
This project brings together a small family of models and datasets centered on Darija, with the goal of building something genuinely useful for a language that is still underserved in AI.
Moroccan Darija is spoken by millions of people, and Nanochat Moroccan is a step toward building tools that take the language seriously.