en-hi-brx-deepseek-v3-dense-pt: A 150M Multilingual Model Fine-Tuned for Algorithmic Reasoning

This repository contains a 150M-parameter dense DeepseekV3 model pre-trained from scratch on a 3-billion-token multilingual corpus of English, Hindi, and Bodo. The Bodo data was synthetically generated and can be accessed here.

The repository includes several checkpoints representing a curriculum-based fine-tuning process designed to teach the model complex, multi-step algorithmic reasoning.

Model Checkpoints

The models are organized in a sequential, curriculum-based structure. Each subsequent model was fine-tuned from the previous one, building upon its learned skills.

1. base

The foundation model, pre-trained on a 3-billion-token multilingual corpus. It has a foundational understanding of English, Hindi, and Bodo grammar and vocabulary but possesses limited reasoning capabilities.

  • Training Corpus: C4 (English), mC4 (Hindi), and a custom Bodo corpus (Public + Synthetic).
  • Parameters: ~150M

2. finetuned_ops

This model is an "ops-expert." It was fine-tuned from the base model on a large, focused dataset of five fundamental arithmetic operations (+, -, *, /, %).

  • Training Strategy: This checkpoint was trained using principles from the "grokking" literature, including extended training (40 epochs) and high weight decay (0.1), to force the model to learn a generalizable "circuit" for arithmetic rather than memorizing examples.
  • Capabilities: High accuracy (~95%) on 2-digit arithmetic tasks.
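A focused dataset over these five operations can be sketched as follows. The exact prompt template used for this checkpoint is not documented in this card, so the `a op b = result` format below is an illustrative assumption; only the operation set and the 2-digit range come from the description above.

```python
import random

# The five operations named in the card. Division is assumed to be
# integer division so every example has an exact answer.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a // b,  # assumption: integer division
    "%": lambda a, b: a % b,
}

def make_example(rng):
    """Generate one 2-digit arithmetic example (template is assumed)."""
    op = rng.choice(sorted(OPS))
    a, b = rng.randint(10, 99), rng.randint(10, 99)
    return f"{a} {op} {b} = {OPS[op](a, b)}"

rng = random.Random(0)
print(make_example(rng))
```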

3. finetuned_gcd

This model is a "sequential reasoning expert." It was fine-tuned from the finetuned_ops model to solve for the Greatest Common Divisor (GCD) using the Euclidean algorithm.

  • Training Strategy: Trained on a multilingual, Chain-of-Thought (CoT) dataset. The model learns to generate the intermediate [rem(...)] steps before providing the final answer. The data was balanced by undersampling cases where GCD=1.
  • Capabilities: High accuracy (~95%) in performing multi-step, sequential reasoning for 2-digit GCD problems in English, Hindi, and Bodo.
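The sequential reasoning this checkpoint learns is the classic remainder loop of the Euclidean algorithm. The sketch below emits one bracketed step per iteration, loosely mirroring the `[gcd(...)][rem(...)]` trace shown in the usage example later in this card (digit-level tokenization such as `<9 9>` is omitted here for clarity):

```python
def gcd_with_trace(a, b):
    """Euclidean algorithm, recording the intermediate remainder steps
    analogous to the model's chain-of-thought trace."""
    steps = []
    while b != 0:
        r = a % b
        steps.append(f"[gcd({a},{b})][rem({a} % {b})={r}]")
        a, b = b, r
    steps.append(f"[gcd({a},{b})]")  # terminal step: b has reached 0
    return a, steps

g, trace = gcd_with_trace(99, 54)
print(g)  # 9
for step in trace:
    print(step)
```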

4. finetuned_bodmas

This is the final and most capable model, designed to be a "hierarchical reasoning expert." It was fine-tuned from the finetuned_gcd model.

  • Training Strategy: Trained on a CoT dataset using an "explicit rewrite" format, where the model must recalculate and rewrite the entire mathematical expression after each operation, correctly following the BODMAS order of operations.
  • Capabilities: Solves multi-step arithmetic expressions involving hierarchical reasoning with reasonable accuracy (~50%).
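The "explicit rewrite" idea can be illustrated with a small reference evaluator: resolve one operation at a time in BODMAS order and re-emit the entire expression after each step. This is a sketch of the format's logic, not the model's actual tokenization; brackets/orders are omitted, and division is assumed to be integer division.

```python
import re

def evaluate_with_rewrites(expr):
    """Evaluate one operation at a time, rewriting the whole expression
    after each step (a sketch of the 'explicit rewrite' CoT format)."""
    rewrites = [expr]
    # BODMAS precedence: *, /, % before + and - (brackets omitted here).
    for ops in (("*", "/", "%"), ("+", "-")):
        pattern = re.compile(r"(-?\d+) ([%s]) (-?\d+)" % re.escape("".join(ops)))
        while True:
            m = pattern.search(rewrites[-1])
            if not m:
                break
            a, op, b = int(m.group(1)), m.group(2), int(m.group(3))
            val = {"*": a * b, "/": a // b, "%": a % b,
                   "+": a + b, "-": a - b}[op]
            rewrites.append(rewrites[-1][:m.start()] + str(val) + rewrites[-1][m.end():])
    return rewrites

for step in evaluate_with_rewrites("2 + 3 * 4 - 6 / 2"):
    print(step)
```

Each printed line is a full rewrite of the expression, ending with the final value.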

How to Use

Load a specific fine-tuned model and the tokenizer for inference.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Choose the desired model checkpoint
model_name = "akshitkr/en-hi-brx-deepseek-v3-dense-pt"
# revision can be 'finetuned_ops', 'finetuned_gcd', or 'finetuned_bodmas'
revision_name = "finetuned_gcd" 

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    revision=revision_name,
    torch_dtype=torch.bfloat16 # Use bfloat16 for inference
)

# Example for GCD model
prompt = "|user|what is the gcd of 9 9 and 5 4|user||reasoning|"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Expected output:
# |user|what is the gcd of 9 9 and 5 4|user||reasoning|{[gcd(<9 9>,<5 4>)][rem(<9 9> % <5 4>)=<4 5>]}{[gcd(<5 4>,<4 5>)][rem(<5 4> % <4 5>)=<0 9>]}{[gcd(<4 5>,<0 9>)][rem(<4 5> % <0 9>)=<0 0>]}{[gcd(<0 9>,<0 0>)]}|reasoning||assistant|<0 9>|assistant|
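To recover the numeric answer from generations in this format, the final `|assistant|`-delimited span can be parsed and its space-separated digits rejoined. The helper below is a convenience sketch based on the expected output shown above, not part of the released tooling:

```python
import re

def extract_answer(generated):
    """Pull the final answer from the |assistant|<d d ...>|assistant| span
    and join the space-separated digits back into an integer."""
    m = re.search(r"\|assistant\|<([\d ]+)>\|assistant\|", generated)
    if m is None:
        return None
    return int(m.group(1).replace(" ", ""))

sample = "|user|what is the gcd of 9 9 and 5 4|user||reasoning|...|assistant|<0 9>|assistant|"
print(extract_answer(sample))  # 9
```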