Abdelaziz Bounhar
BounharAbdelaziz
AI & ML interests
Deep Learning, Reinforcement Learning, AI Agents, Generative Modeling, NLP, Information Theory, Security of Machine Learning, etc.
Recent Activity
liked a dataset about 11 hours ago
TeraflopAI/SEC-EDGAR
published a dataset 8 days ago
BounharAbdelaziz/Morocco-Darija-Speech-35h-Fixed
published a dataset 9 days ago
BounharAbdelaziz/No-Arabic-Dialect-Left-Behind-iso
Moroccan Darija LLMs
Language models that speak Moroccan Darija (ary)
Moroccan Speech Models & Datasets
Moroccan Darija speech-to-text (STT)
- atlasia/DODa-audio-dataset
  Viewer • Updated • 12.7k • 581 • 18
- atlasia/Moroccan-Darija-Wiki-Audio-Dataset
  Viewer • Updated • 492 • 92 • 14
- BounharAbdelaziz/Dvoice-v2-cleaned
  Viewer • Updated • 120 • 17
- BounharAbdelaziz/Morocco-Darija-Speech-35h-Fixed
  Viewer • Updated • 21.7k • 5
Translation Models & Datasets
English-to-Moroccan Darija (ary) translation models
- BounharAbdelaziz/Terjman-v2-English-Morocco-Darija-Dataset-350K
  Viewer • Updated • 355k • 10 • 1
- BounharAbdelaziz/Terjman-Supreme-v2.0
  Translation • 3B • Updated • 2
- BounharAbdelaziz/Terjman-Ultra-v2.0
  Translation • 1B • Updated • 19 • 2
- BounharAbdelaziz/Terjman-Large-v2.0
  Translation • 0.2B • Updated • 2
Arabic (MSA) Summarization Models & Datasets
A collection of models trained to summarize Arabic text, along with the dataset used to train them.
- BounharAbdelaziz/MaYofid-Qwen2.5-3B-Instruct
  Text Generation • 3B • Updated • 1
- BounharAbdelaziz/MaYofid-Falcon3-3B-Instruct
  Text Generation • 3B • Updated • 3
- BounharAbdelaziz/MaYofid-Qwen2.5-3B-Instruct-AWQ
  3B • Updated • 2
- BounharAbdelaziz/Arabic-Synthetic-Summarization-Dataset-Filtered
  Viewer • Updated • 4.41k • 40 • 1
RLHF/RLVR
Some RLHF/RLVR experiments using GRPO and DPO.
Moroccan Darija Embeddings Models & Datasets
Sentence and word embedding models for Moroccan Darija (ary)
- BounharAbdelaziz/Morocco-Darija-Sentence-Embedding-v0.2
  Sentence Similarity • 0.6B • Updated • 2
- BounharAbdelaziz/ModernBERT-Morocco-Sentence-Embeddings-v0.2-bs-32-lr-2e-05-ep-2-wp-0.05-gacc-1-gnm-1.0-v0.3
  Sentence Similarity • 0.2B • Updated • 1
- BounharAbdelaziz/Morocco-Darija-Sentence-Embedding-v0.1
  Feature Extraction • 0.6B • Updated • 2 • 2
- BounharAbdelaziz/XLM-RoBERTa-Morocco-bs-32-lr-2e-05-ep-2-wp-0.05-gacc-1-gnm-1.0-v0.3
  0.6B • Updated • 8
Moroccan Darija Datasets
A collection of all available datasets for pretraining LLMs
Arabic (MSA) Language Models & Datasets
Frugal-AI
- Shorter but not Worse: Frugal Reasoning via Easy Samples as Length Regularizers in Math RLVR
  Paper • 2511.01937 • Published • 16
- MBZUAI-Paris/Frugal-Thinking-30B-A3B
  Text Generation • 31B • Updated • 1 • 2
- MBZUAI-Paris/Frugal-Thinking-4B
  Text Generation • 4B • Updated • 6 • 7
- MBZUAI-Paris/Frugal-Thinking-RL-Data
  Viewer • Updated • 177k • 5 • 4