# sanskrit-translation

This repository contains the exported checkpoint and runtime files for the Sanskrit IAST → Devanagari D3PM cross-attention model.
## Included Files

- `best_model.pt`: primary demo checkpoint
- `best_val_model.pt`: best validation-loss checkpoint (if available)
- `quality_predictor.pt`: Task 5 quality predictor (if available)
- `project_config.json`: serialized training/inference config
- `sanskrit_src_tokenizer_v1000.json`
- `sanskrit_tgt_tokenizer_v2000.json`
- `inference.py`
- `models/diffusion/`
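The tokenizer filenames appear to encode their vocabulary sizes in the `v1000`/`v2000` suffixes (an inference from the names, not documented in the repository). A minimal sketch that recovers the size from a filename under that assumption:

```python
import re

def tokenizer_vocab_size(filename: str):
    """Parse the vocab size from a tokenizer filename such as
    'sanskrit_src_tokenizer_v1000.json' (assumed naming convention).
    Returns None when the name does not follow the pattern."""
    m = re.search(r"_v(\d+)\.json$", filename)
    return int(m.group(1)) if m else None

print(tokenizer_vocab_size("sanskrit_src_tokenizer_v1000.json"))  # 1000
print(tokenizer_vocab_size("sanskrit_tgt_tokenizer_v2000.json"))  # 2000
```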
## Local Load Example

```bash
git clone https://huggingface.co/bhsinghgrid/sanskrit-translation
cd sanskrit-translation
python inference.py --model best_model.pt --cli
```
## Python Download Example

```python
from huggingface_hub import snapshot_download

repo_dir = snapshot_download("bhsinghgrid/sanskrit-translation")
print("Downloaded to:", repo_dir)
```
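After downloading, it can be worth verifying that the runtime files listed above actually landed in the snapshot before attempting to load the model. This is a hypothetical helper, not part of the repository:

```python
import os

# Files the custom runtime needs, per the "Included Files" list
# (best_val_model.pt and quality_predictor.pt are optional, so not checked).
EXPECTED_FILES = [
    "best_model.pt",
    "project_config.json",
    "sanskrit_src_tokenizer_v1000.json",
    "sanskrit_tgt_tokenizer_v2000.json",
    "inference.py",
]

def missing_files(repo_dir: str):
    """Return the expected runtime files absent from repo_dir."""
    return [
        f for f in EXPECTED_FILES
        if not os.path.exists(os.path.join(repo_dir, f))
    ]
```

Usage: `missing = missing_files(repo_dir)` and fail fast if the list is non-empty.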
## Transformer-Style Usage (Custom Runtime)

This model uses a custom D3PM architecture, so load it with the included runtime:

```python
import os

import torch
from huggingface_hub import snapshot_download

from config import CONFIG
from inference import load_model, _build_tokenizers

repo_dir = snapshot_download("bhsinghgrid/sanskrit-translation")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

cfg = CONFIG
model, cfg = load_model(os.path.join(repo_dir, "best_model.pt"), cfg, device)
src_tok, tgt_tok = _build_tokenizers(cfg)

def translate(text: str) -> str:
    # Encode the IAST source text and run the D3PM sampler.
    x = torch.tensor([src_tok.encode(text)], dtype=torch.long, device=device)
    y = model.generate(
        x,
        num_steps=cfg["inference"]["num_steps"],
        temperature=cfg["inference"]["temperature"],
        top_k=cfg["inference"]["top_k"],
        repetition_penalty=cfg["inference"]["repetition_penalty"],
        diversity_penalty=cfg["inference"]["diversity_penalty"],
    )
    # Drop reserved special-token IDs (0-4) before decoding.
    ids = [i for i in y[0].tolist() if i > 4]
    return tgt_tok.decode(ids).strip()

print(translate("dharmo rakṣati rakṣitaḥ"))
```
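The `i > 4` filter in `translate` assumes token IDs 0–4 are reserved special tokens (pad/bos/eos and similar; the exact assignment is an assumption, not documented here). Factored out as a standalone sketch:

```python
NUM_SPECIAL_TOKENS = 5  # assumed: IDs 0-4 are reserved special tokens

def strip_special_ids(ids, num_special=NUM_SPECIAL_TOKENS):
    """Drop reserved special-token IDs before detokenizing,
    mirroring the `i > 4` filter in translate()."""
    return [i for i in ids if i >= num_special]

print(strip_special_ids([1, 0, 17, 342, 2]))  # [17, 342]
```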
## Important Compatibility Note

- This repository is not a native Hugging Face Transformers checkpoint (`config.json` + `model.safetensors`).
- It therefore cannot be loaded directly with `AutoModelForCausalLM.from_pretrained(...)` or `pipeline(...)`.
- Use the provided project runtime (`inference.py` / API wrapper) instead.
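As a quick guard, one can check whether a downloaded directory looks like a native Transformers checkpoint before reaching for `from_pretrained`; this repository fails the check by design. A hypothetical helper:

```python
import os

def is_native_transformers_checkpoint(repo_dir: str) -> bool:
    """A native HF Transformers checkpoint ships config.json plus weights
    in model.safetensors (or the older pytorch_model.bin)."""
    has_config = os.path.exists(os.path.join(repo_dir, "config.json"))
    has_weights = any(
        os.path.exists(os.path.join(repo_dir, f))
        for f in ("model.safetensors", "pytorch_model.bin")
    )
    return has_config and has_weights
```

For this repository the function returns `False`, which is the signal to fall back to the custom runtime above.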
## Provenance

The checkpoint exported here came from:

```
/Users/bhsingh/Documents/Final_Paraphrase/Modify/results8/d3pm_cross_attention_neg_False/best_model.pt
```