🔥 Overview
qwen3-sa-hi-think is a Sanskrit → Hindi Machine Translation model fine-tuned on top of Qwen3-4B, enhanced with Thinking Tokens to enable structured latent reasoning during translation.
The model was trained using LoRA (rank 32) on top of the thinking-enabled variant of Qwen3, allowing it to internally perform step-by-step reasoning before producing the final Hindi output.
Because of its reasoning ability, the model excels in:
- Classical Sanskrit poetry (ślokas, Anuṣṭubh meter)
- Epic and Purāṇic Sanskrit
- Formal Sanskrit prose
- Semantic interpretation of compounds (samāsa), metaphors, and sandhi-heavy verse

This reasoning step leads to significantly more accurate, context-aware Hindi translations.
This model is developed by Pretam Ray (IIT Kharagpur) as part of ongoing research on Indic low-resource MT, explainable translation, and poetry-aware reasoning models.
🧠 Model Architecture
- Base Model: Qwen3-4B-Thinking variant (supports reasoning traces via special thinking tokens)
- Reasoning Tokens Used: yes (`<think>` tokens during SFT)
- Fine-tuning Method: LoRA
- LoRA Rank: 32
- LoRA Alpha: 16
- Precision: FP16/BF16 mixed precision
- Framework: LLaMA-Factory / HF PEFT
- Context Length: inherited from Qwen3-4B (32k tokens)
- Tokenizer: Qwen3 tokenizer (same as base)
🔍 What are Thinking Tokens?
The underlying Qwen3-4B-Thinking model uses special reasoning tokens (often <think> ... </think>) during training.
These guide the model to produce hidden chain-of-thought reasoning internally, improving:
- meaning disambiguation
- correct handling of long compounds
- proper rearrangement of Sanskrit → Hindi syntax
- preservation of metaphoric sense
This model suppresses chain-of-thought in final outputs but uses it internally for better translation.
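If a deployment does surface the raw reasoning trace, it can be stripped before display. A minimal sketch, assuming the standard `<think> ... </think>` delimiters (the sample strings below are illustrative, not taken from the model's actual output):

```python
import re

def strip_thinking(text: str) -> str:
    """Remove <think>...</think> reasoning traces, keeping only the final answer."""
    # Drop complete reasoning spans (non-greedy, spanning newlines).
    cleaned = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    return cleaned.strip()

raw = "<think>पदच्छेद: कः नु अस्मिन् साम्प्रतम् ...</think>इस समय संसार में गुणवान और वीर्यवान कौन है?"
print(strip_thinking(raw))  # → इस समय संसार में गुणवान और वीर्यवान कौन है?
```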
📚 Dataset Composition
Curated Sanskrit → Hindi aligned pairs
Includes:
- Anuṣṭubh epic verse (Rāmāyaṇa, Mahābhārata, etc.)
- Traditional ślokas
- Sanskrit prose passages
Format: JSONL

Training Size: <fill>
Validation Size: <fill>
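The JSONL pairs can be read with the standard library. A minimal sketch; the field names `sanskrit` and `hindi` and the sample sentences are assumptions for illustration, not confirmed by this card:

```python
import json

# Hypothetical records; the actual field names in the released data may differ.
sample_jsonl = """\
{"sanskrit": "धर्मो रक्षति रक्षितः", "hindi": "धर्म की रक्षा करने वाले की धर्म रक्षा करता है"}
{"sanskrit": "सत्यमेव जयते", "hindi": "सत्य की ही विजय होती है"}
"""

# One JSON object per line, as in a standard JSONL file.
pairs = [json.loads(line) for line in sample_jsonl.splitlines() if line.strip()]
print(len(pairs))  # → 2
print(pairs[0]["sanskrit"])
```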
🔧 Training Configuration
Base Model: Qwen3-4B-Thinking
LoRA Rank: 32
LoRA Dropout: 0.05
LoRA Target Modules: q_proj, k_proj, v_proj, o_proj
Batch Size: <fill>
Learning Rate: <fill>
Epochs: <fill>
Optimizer: AdamW
Scheduler: cosine
Framework: LLaMA-Factory
Reasoning Tokens: Enabled (+latent CoT)
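For reproduction, the settings above map roughly onto a LLaMA-Factory SFT config. A sketch only: key names follow LLaMA-Factory's YAML schema, the base checkpoint ID is an assumption, and the `<fill>` values are unspecified in this card:

```yaml
model_name_or_path: Qwen/Qwen3-4B-Thinking-2507   # assumed base checkpoint ID
stage: sft
finetuning_type: lora
lora_rank: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target: q_proj,k_proj,v_proj,o_proj
lr_scheduler_type: cosine
bf16: true
# per_device_train_batch_size, learning_rate, num_train_epochs: <fill>
```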
💡 Example Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the same repository
model_id = "sanganaka/qwen3-4B-sa-hi-think"

model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

text = "कोन्वस्मिन्साम्प्रतं लोके गुणवान्कश्च वीर्यवान्"
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Leave headroom for the latent reasoning trace before the final Hindi output
out = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```