# English to Hindi Transformer (Optuna Optimized)

A custom PyTorch Transformer for English-to-Hindi translation, with hyperparameters tuned by Optuna using its `SuccessiveHalvingPruner` (ASHA-style early stopping of unpromising trials).
## Results
| Model | Epochs | Loss | BLEU |
|---|---|---|---|
| Baseline | 100 | 0.1484 | 0.5123 |
| Best Tuned | 20 | 0.3500 | 0.5260 |

The tuned model reaches a slightly higher BLEU than the baseline (0.5260 vs 0.5123) in one fifth of the epochs.
## Best Hyperparameters
```json
{
  "d_model": 512,
  "num_heads": 8,
  "num_enc_layers": 4,
  "num_dec_layers": 3,
  "d_ff": 2048,
  "dropout": 0.05109771787083142,
  "lr": 0.00020147922100273766,
  "batch_size": 128
}
```
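For a back-of-envelope sense of model size, this config implies roughly 25M parameters in the encoder/decoder stacks. The sketch below counts only the attention and feed-forward weight matrices, ignoring embeddings, layer norms, and biases (an assumption, not an exact count for this checkpoint):

```python
def approx_params(d_model, num_enc_layers, num_dec_layers, d_ff, **_):
    attn = 4 * d_model * d_model   # Q, K, V, and output projections
    ffn = 2 * d_model * d_ff       # the two feed-forward linear layers
    enc_layer = attn + ffn
    dec_layer = 2 * attn + ffn     # self-attention plus cross-attention
    return num_enc_layers * enc_layer + num_dec_layers * dec_layer

cfg = {"d_model": 512, "num_heads": 8, "num_enc_layers": 4,
       "num_dec_layers": 3, "d_ff": 2048}
print(approx_params(**cfg))  # → 25165824
```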
## Usage
```python
import json

import torch
from huggingface_hub import hf_hub_download

# Download the trained weights and the tuned hyperparameter config.
weights = hf_hub_download("Saumya3007/en-hi-transformer-tuned", "rollno_ass_4_best_model.pth")
with open(hf_hub_download("Saumya3007/en-hi-transformer-tuned", "best_config.json")) as f:
    cfg = json.load(f)

# `Transformer`, `src_vocab`, and `tgt_vocab` come from the training code.
keys = ["d_model", "num_heads", "num_enc_layers", "num_dec_layers", "d_ff", "dropout"]
model = Transformer(src_vocab, tgt_vocab, **{k: cfg[k] for k in keys})
model.load_state_dict(torch.load(weights, map_location="cpu"))
model.eval()
```
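Loading the checkpoint gives a model that produces logits, not translations. A minimal greedy-decoding loop could look like the sketch below; it assumes the model is called as `model(src, tgt)` and returns logits of shape `[batch, tgt_len, vocab]`, and the `bos_id`/`eos_id` token ids are hypothetical — check both against the actual training code:

```python
import torch

@torch.no_grad()
def greedy_decode(model, src, bos_id, eos_id, max_len=64):
    # Autoregressive greedy decoding: feed the growing target back in and
    # take the argmax of the last position's logits at each step.
    tgt = torch.tensor([[bos_id]], dtype=torch.long)
    for _ in range(max_len):
        logits = model(src, tgt)                  # [1, tgt_len, vocab]
        next_id = logits[0, -1].argmax().item()
        tgt = torch.cat([tgt, torch.tensor([[next_id]])], dim=1)
        if next_id == eos_id:
            break
    return tgt[0].tolist()

# Toy stand-in model that always predicts token 3, just to exercise the loop:
toy = lambda src, tgt: torch.zeros(1, tgt.shape[1], 5).index_fill_(2, torch.tensor([3]), 1.0)
print(greedy_decode(toy, torch.tensor([[1, 2]]), bos_id=0, eos_id=3))  # → [0, 3]
```

The returned ids would then be mapped back to text through `tgt_vocab` from the training code.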