# sanskrit-translation

This repository contains the exported checkpoint and runtime files for the Sanskrit IAST → Devanagari D3PM cross-attention model.
## Included Files

- `best_model.pt`: primary demo checkpoint
- `best_val_model.pt`: best validation-loss checkpoint (if available)
- `quality_predictor.pt`: Task 5 quality predictor (if available)
- `project_config.json`: serialized training/inference config
- `sanskrit_src_tokenizer_v1000.json`
- `sanskrit_tgt_tokenizer_v2000.json`
- `inference.py`
- `models/diffusion/`
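The tokenizer filenames appear to encode their vocabulary sizes in the `v1000`/`v2000` suffixes (an inference from the names, not documented in the repository). A minimal sketch that recovers the size from a filename under that assumption:

```python
import re

def tokenizer_vocab_size(filename: str):
    """Parse the vocab size from a tokenizer filename such as
    'sanskrit_src_tokenizer_v1000.json' (assumed naming convention).
    Returns None when the name does not follow the pattern."""
    m = re.search(r"_v(\d+)\.json$", filename)
    return int(m.group(1)) if m else None

print(tokenizer_vocab_size("sanskrit_src_tokenizer_v1000.json"))  # 1000
print(tokenizer_vocab_size("sanskrit_tgt_tokenizer_v2000.json"))  # 2000
```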
## Local Load Example

```bash
git clone https://huggingface.co/bhsinghgrid/sanskrit-translation
cd sanskrit-translation
python inference.py --model best_model.pt --cli
```
## Python Download Example

```python
from huggingface_hub import snapshot_download

repo_dir = snapshot_download("bhsinghgrid/sanskrit-translation")
print("Downloaded to:", repo_dir)
```
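After downloading, it can be worth verifying that the runtime files listed above actually landed in the snapshot before attempting to load the model. This is a hypothetical helper, not part of the repository:

```python
import os

# Files the custom runtime needs, per the "Included Files" list
# (best_val_model.pt and quality_predictor.pt are optional, so not checked).
EXPECTED_FILES = [
    "best_model.pt",
    "project_config.json",
    "sanskrit_src_tokenizer_v1000.json",
    "sanskrit_tgt_tokenizer_v2000.json",
    "inference.py",
]

def missing_files(repo_dir: str):
    """Return the expected runtime files absent from repo_dir."""
    return [
        f for f in EXPECTED_FILES
        if not os.path.exists(os.path.join(repo_dir, f))
    ]
```

Usage: `missing = missing_files(repo_dir)` and fail fast if the list is non-empty.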
## Transformer-Style Usage (Custom Runtime)

This model uses a custom D3PM architecture, so load it with the included runtime:

```python
import os

import torch
from huggingface_hub import snapshot_download

from config import CONFIG
from inference import load_model, _build_tokenizers

repo_dir = snapshot_download("bhsinghgrid/sanskrit-translation")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

cfg = CONFIG
model, cfg = load_model(os.path.join(repo_dir, "best_model.pt"), cfg, device)
src_tok, tgt_tok = _build_tokenizers(cfg)

def translate(text: str) -> str:
    # Encode the IAST source text and run the D3PM sampler.
    x = torch.tensor([src_tok.encode(text)], dtype=torch.long, device=device)
    y = model.generate(
        x,
        num_steps=cfg["inference"]["num_steps"],
        temperature=cfg["inference"]["temperature"],
        top_k=cfg["inference"]["top_k"],
        repetition_penalty=cfg["inference"]["repetition_penalty"],
        diversity_penalty=cfg["inference"]["diversity_penalty"],
    )
    # Drop reserved special-token IDs (0-4) before decoding.
    ids = [i for i in y[0].tolist() if i > 4]
    return tgt_tok.decode(ids).strip()

print(translate("dharmo rakṣati rakṣitaḥ"))
```
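The `i > 4` filter in `translate` assumes token IDs 0–4 are reserved special tokens (pad/bos/eos and similar; the exact assignment is an assumption, not documented here). Factored out as a standalone sketch:

```python
NUM_SPECIAL_TOKENS = 5  # assumed: IDs 0-4 are reserved special tokens

def strip_special_ids(ids, num_special=NUM_SPECIAL_TOKENS):
    """Drop reserved special-token IDs before detokenizing,
    mirroring the `i > 4` filter in translate()."""
    return [i for i in ids if i >= num_special]

print(strip_special_ids([1, 0, 17, 342, 2]))  # [17, 342]
```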
## Important Compatibility Note

- This repository is not a native Hugging Face Transformers checkpoint (`config.json` + `model.safetensors`).
- It therefore cannot be loaded directly with `AutoModelForCausalLM.from_pretrained(...)` or `pipeline(...)`.
- Use the provided project runtime (`inference.py` / API wrapper) instead.
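As a quick guard, one can check whether a downloaded directory looks like a native Transformers checkpoint before reaching for `from_pretrained`; this repository fails the check by design. A hypothetical helper:

```python
import os

def is_native_transformers_checkpoint(repo_dir: str) -> bool:
    """A native HF Transformers checkpoint ships config.json plus weights
    in model.safetensors (or the older pytorch_model.bin)."""
    has_config = os.path.exists(os.path.join(repo_dir, "config.json"))
    has_weights = any(
        os.path.exists(os.path.join(repo_dir, f))
        for f in ("model.safetensors", "pytorch_model.bin")
    )
    return has_config and has_weights
```

For this repository the function returns `False`, which is the signal to fall back to the custom runtime above.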
## Provenance

The checkpoint exported here came from:

```
/Users/bhsingh/Documents/Final_Paraphrase/Modify/results8/d3pm_cross_attention_neg_False/best_model.pt
```