Raziel AI Learning Community

#2
by razielAI - opened

Duchifat-3 is coming soon!! 🔥
A new, smaller, and smarter model,
designed for everyday tasks with greater efficiency.


🚀 Duchifat‑3: Pioneering the Next Generation of Language Models

Dear community members,

We are thrilled to share a detailed update on Duchifat‑3, the next evolution in our line of language models. While Duchifat‑3 is still under development, this update is designed to provide transparency on what makes it distinct from its predecessor, Duchifat‑2, and what users can expect in the coming weeks.

Design Philosophy

Duchifat‑3 is not just about more parameters or bigger models—it embodies a philosophy of “more suitable, not necessarily larger”. Our goal is to create a model that balances efficiency, multilingual proficiency, and instruction-following capabilities while remaining lightweight and responsive for everyday applications.

Key Improvements Over Duchifat‑2

  1. Optimized Parameterization
    Duchifat‑3 contains 31.2 million parameters, fewer than Duchifat‑2. This reduction allows for faster inference and lower memory usage, making it more practical for real-time interactions, yet the architecture has been carefully engineered to maximize expressivity per parameter.

  2. Expanded Training Corpus
    The model is being trained on 14.7 billion tokens from C4, split evenly between Hebrew and English. This represents a significant increase in data exposure compared to Duchifat‑2, providing stronger multilingual understanding, context retention, and nuanced language generation.

  3. Modernized Architecture
    We are leveraging a contemporary transformer design with improvements such as:

    • Enhanced RMSNorm layers for stable gradient flow and more consistent training dynamics.
    • An upgraded RoPE positional embedding mechanism, enabling more accurate long-range attention and sequence handling.
    • Optimized attention head configurations for efficiency without sacrificing model capacity.
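To make the two architecture bullets above concrete, here is a minimal plain-Python sketch of RMSNorm and rotary positional embeddings (RoPE) in their standard textbook form. This is an illustration of the general techniques only, not Duchifat‑3's actual implementation; the epsilon value and the rotary base of 10000 are assumptions taken from common practice.

```python
import math

def rms_norm(x, gain=None, eps=1e-6):
    """RMSNorm: rescale a vector by its root-mean-square.
    Unlike LayerNorm, no mean is subtracted, which simplifies
    computation and tends to stabilize training dynamics."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    g = gain if gain is not None else [1.0] * len(x)
    return [v / rms * gi for v, gi in zip(x, g)]

def rope(x, pos, base=10000.0):
    """RoPE: rotate each (even, odd) pair of dimensions by an
    angle proportional to the token position. Relative offsets
    between positions are thus encoded directly in attention
    dot products, which helps long-range sequence handling."""
    d = len(x)
    out = []
    for i in range(0, d, 2):
        angle = pos * base ** (-i / d)  # per-pair frequency
        c, s = math.cos(angle), math.sin(angle)
        out.append(x[i] * c - x[i + 1] * s)
        out.append(x[i] * s + x[i + 1] * c)
    return out
```

Both operations preserve useful invariants: `rms_norm` produces a vector with unit RMS (before the gain), and `rope` is a pure rotation, so it leaves the vector's norm unchanged.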
  4. Instruction-Focused Behavior
    Duchifat‑3 is being fine-tuned to follow short, actionable instructions more effectively. While Duchifat‑2 excelled at content generation, long-form text, and knowledge retrieval, Duchifat‑3 is optimized for:

    • Chatbot interactions
    • Quick question answering
    • Short, structured reasoning tasks

    This focus ensures that the model is immediately practical for a wide range of real-world applications.
  5. Advanced Tokenization with Dicta
    Utilizing the latest Dicta tokenizer, Duchifat‑3 achieves superior handling of Hebrew-English mixed content, rare tokens, and edge cases. This leads to more accurate token-level predictions and smoother output across languages.

Why Smaller Can Be Smarter

A larger model is not always better. Duchifat‑3 demonstrates that careful architecture design, large volumes of high-quality training data, and targeted instruction fine-tuning can produce a model that is more suitable, faster, and more robust for practical applications—even with fewer parameters. This is a key insight we hope the community appreciates and experiments with.

Community Engagement and Preview

We intend to release early preview checkpoints soon. Your feedback will be instrumental in refining Duchifat‑3 for real-world use. By engaging early, the community can influence:

  • Fine-tuning strategies
  • Instruction-following behavior
  • Output quality in mixed-language contexts

Looking Forward

Duchifat‑3 represents our commitment to accessible, high-performance language models that can serve both researchers and practitioners. We believe in co-creating with the community, learning from early feedback, and iterating rapidly to deliver a model that is both technically strong and highly usable.

Stay tuned for upcoming announcements regarding the Duchifat‑3 preview release. We can’t wait to share this journey with all of you and continue building a thriving community around high-quality multilingual language models. 🌟

— Raziel AI Learning

razielAI changed discussion title from Duchifat-3. Coming soon!! 🔥 to Duchifat
razielAI changed discussion title from Duchifat to Raziel AI Learning Community


Orion-1: Coming Soon to the Community 🚀

Orion-1 is the first Socratic, Hebrew-native LLM specifically designed for math education.

  • Hebrew-First Logic: Built for the Israeli curriculum.
  • Socratic Tutoring: Guides students instead of just giving answers.
  • Open Source: Coming soon to the Hugging Face Community.

Developed by Raziel AI Learning Projects

razielAI changed discussion status to closed
razielAI changed discussion status to open

Orion is coming soon!!
