🔥 Overview
qwen3-sa-hi-think is a Sanskrit → Hindi Machine Translation model fine-tuned on top of Qwen3-4B, enhanced with Thinking Tokens to enable structured latent reasoning during translation.
The model was trained using LoRA (rank 32) on top of the thinking-enabled variant of Qwen3, allowing it to internally perform step-by-step reasoning before producing the final Hindi output.
Because of its reasoning ability, the model excels in:
- Classical Sanskrit poetry (ślokas, Anuṣṭubh meter)
- Epic and Purāṇic Sanskrit
- Formal Sanskrit prose
- Semantic interpretation of compounds (samāsa), metaphors, and sandhi-heavy verse

This reasoning step leads to significantly more accurate, context-aware Hindi translations.
This model is developed by Pretam Ray (IIT Kharagpur) as part of ongoing research on Indic low-resource MT, explainable translation, and poetry-aware reasoning models.
🧠 Model Architecture
- Base Model: Qwen3-4B-Thinking variant (supports reasoning traces via special thinking tokens)
- Reasoning Tokens Used: yes (`<think>` tokens during SFT)
- Fine-tuning Method: LoRA
- LoRA Rank: 32
- LoRA Alpha: 16
- Precision: FP16/BF16 mixed precision
- Framework: LLaMA-Factory / HF PEFT
- Context Length: inherited from Qwen3-4B (32k tokens)
- Tokenizer: Qwen3 tokenizer (same as base)
🔍 What are Thinking Tokens?
The underlying Qwen3-4B-Thinking model uses special reasoning tokens (often <think> ... </think>) during training.
These guide the model to produce hidden chain-of-thought reasoning internally, improving:
- meaning disambiguation
- correct handling of long compounds
- proper rearrangement of Sanskrit → Hindi syntax
- preservation of metaphoric sense
This model suppresses chain-of-thought in final outputs but uses it internally for better translation.
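If a deployment does surface the raw reasoning trace, it can be stripped before display. A minimal sketch, assuming the standard `<think> ... </think>` delimiters (the sample strings below are illustrative, not taken from the model's actual output):

```python
import re

def strip_thinking(text: str) -> str:
    """Remove <think>...</think> reasoning traces, keeping only the final answer."""
    # Drop complete reasoning spans (non-greedy, spanning newlines).
    cleaned = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    return cleaned.strip()

raw = "<think>पदच्छेद: कः नु अस्मिन् साम्प्रतम् ...</think>इस समय संसार में गुणवान और वीर्यवान कौन है?"
print(strip_thinking(raw))  # → इस समय संसार में गुणवान और वीर्यवान कौन है?
```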
📚 Dataset Composition
Curated Sanskrit → Hindi aligned pairs
Includes:
- Anuṣṭubh epic verse (Rāmāyaṇa, Mahābhārata, etc.)
- Traditional ślokas
- Sanskrit prose passages
Format: JSONL

Training Size: <fill>
Validation Size: <fill>
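The JSONL pairs can be read with the standard library. A minimal sketch; the field names `sanskrit` and `hindi` and the sample sentences are assumptions for illustration, not confirmed by this card:

```python
import json

# Hypothetical records; the actual field names in the released data may differ.
sample_jsonl = """\
{"sanskrit": "धर्मो रक्षति रक्षितः", "hindi": "धर्म की रक्षा करने वाले की धर्म रक्षा करता है"}
{"sanskrit": "सत्यमेव जयते", "hindi": "सत्य की ही विजय होती है"}
"""

# One JSON object per line, as in a standard JSONL file.
pairs = [json.loads(line) for line in sample_jsonl.splitlines() if line.strip()]
print(len(pairs))  # → 2
print(pairs[0]["sanskrit"])
```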
🔧 Training Configuration
Base Model: Qwen3-4B-Thinking
LoRA Rank: 32
LoRA Dropout: 0.05
LoRA Target Modules: q_proj, k_proj, v_proj, o_proj
Batch Size: <fill>
Learning Rate: <fill>
Epochs: <fill>
Optimizer: AdamW
Scheduler: cosine
Framework: LLaMA-Factory
Reasoning Tokens: Enabled (+latent CoT)
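For reproduction, the settings above map roughly onto a LLaMA-Factory SFT config. A sketch only: key names follow LLaMA-Factory's YAML schema, the base checkpoint ID is an assumption, and the `<fill>` values are unspecified in this card:

```yaml
model_name_or_path: Qwen/Qwen3-4B-Thinking-2507   # assumed base checkpoint ID
stage: sft
finetuning_type: lora
lora_rank: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target: q_proj,k_proj,v_proj,o_proj
lr_scheduler_type: cosine
bf16: true
# per_device_train_batch_size, learning_rate, num_train_epochs: <fill>
```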
💡 Example Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the same repository
model_id = "sanganaka/qwen3-4B-sa-hi-think"

model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

text = "कोन्वस्मिन्साम्प्रतं लोके गुणवान्कश्च वीर्यवान्"
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Leave headroom for the latent reasoning trace before the final Hindi output
out = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```