indicbart-hindi-gec-v1

This model is a fine-tuned version of ai4bharat/IndicBART. The fine-tuning dataset is not specified. It achieves the following results on the evaluation set (the value matches the final row of the training results table, step 14800):

Loss: 0.1612
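
For intuition, a cross-entropy loss can be converted to a perplexity by exponentiating it. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats with no label smoothing, which this card does not state; `loss_to_perplexity` is a hypothetical helper name, not part of any library.

```python
import math


def loss_to_perplexity(loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss)


# Evaluation loss reported in this card.
eval_loss = 0.1612

# Assumption: the loss is a per-token average, so exp(loss) is a
# per-token perplexity over the evaluation set.
perplexity = loss_to_perplexity(eval_loss)
print(f"perplexity ~ {perplexity:.3f}")
```

Under those assumptions the result is roughly 1.17. A low perplexity is expected for a correction-style task where most output tokens copy the input, so it says little about correction quality on its own.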

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 3
mixed_precision_training: Native AMP
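
The hyperparameters above could be expressed as keyword arguments for `transformers.Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the author's training script: `output_dir` is a hypothetical path, the batch sizes are assumed to be per-device values on a single device, "Native AMP" is assumed to mean `fp16=True`, and the use of the seq2seq variant of the arguments class is itself an assumption. `build_training_args` is a hypothetical helper.

```python
# Hyperparameters copied from the list above. The keyword names follow
# transformers.TrainingArguments; values not listed in the card are
# left at their library defaults.
training_kwargs = {
    "output_dir": "indicbart-hindi-gec-v1",  # hypothetical output path
    "learning_rate": 3e-5,
    "per_device_train_batch_size": 16,  # assumes a single device
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
    "fp16": True,  # assumption: "Native AMP" refers to fp16 mixed precision
}


def build_training_args():
    """Build the arguments object; requires transformers and a CUDA device for fp16."""
    # Imported lazily so the dictionary above can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**training_kwargs)
```

The results table logs an evaluation every 200 steps, so the original run presumably also set a step-based evaluation and logging interval; those settings are not listed in the card and are omitted here.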

Training results

Training Loss	Epoch	Step	Validation Loss
1.1508	0.0400	200	0.9250
0.7768	0.0801	400	0.5801
0.4865	0.1201	600	0.3398
0.3966	0.1602	800	0.2952
0.3574	0.2002	1000	0.2807
0.3473	0.2402	1200	0.2684
0.3254	0.2803	1400	0.2534
0.2989	0.3203	1600	0.2348
0.2735	0.3604	1800	0.2198
0.2517	0.4004	2000	0.2086
0.2593	0.4404	2200	0.2024
0.2488	0.4805	2400	0.1985
0.2496	0.5205	2600	0.1954
0.2385	0.5606	2800	0.1924
0.2341	0.6006	3000	0.1905
0.2201	0.6406	3200	0.1877
0.2256	0.6807	3400	0.1855
0.2388	0.7207	3600	0.1845
0.2484	0.7608	3800	0.1831
0.21	0.8008	4000	0.1799
0.2168	0.8408	4200	0.1820
0.225	0.8809	4400	0.1790
0.2202	0.9209	4600	0.1765
0.2032	0.9610	4800	0.1765
0.2013	1.0010	5000	0.1742
0.1922	1.0410	5200	0.1721
0.1927	1.0811	5400	0.1724
0.1982	1.1211	5600	0.1704
0.2036	1.1612	5800	0.1717
0.1838	1.2012	6000	0.1690
0.2026	1.2412	6200	0.1710
0.1876	1.2813	6400	0.1684
0.1785	1.3213	6600	0.1684
0.1808	1.3614	6800	0.1693
0.1873	1.4014	7000	0.1689
0.1859	1.4414	7200	0.1667
0.1865	1.4815	7400	0.1653
0.186	1.5215	7600	0.1655
0.1891	1.5616	7800	0.1654
0.184	1.6016	8000	0.1646
0.1784	1.6416	8200	0.1644
0.1717	1.6817	8400	0.1636
0.1841	1.7217	8600	0.1636
0.1785	1.7618	8800	0.1623
0.1659	1.8018	9000	0.1625
0.1741	1.8418	9200	0.1629
0.1715	1.8819	9400	0.1623
0.1638	1.9219	9600	0.1627
0.1873	1.9620	9800	0.1623
0.1594	2.0020	10000	0.1622
0.172	2.0420	10200	0.1620
0.1628	2.0821	10400	0.1628
0.1661	2.1221	10600	0.1619
0.1773	2.1622	10800	0.1617
0.1689	2.2022	11000	0.1621
0.1655	2.2422	11200	0.1613
0.159	2.2823	11400	0.1614
0.1596	2.3223	11600	0.1613
0.16	2.3624	11800	0.1615
0.1632	2.4024	12000	0.1616
0.1661	2.4424	12200	0.1623
0.1565	2.4825	12400	0.1619
0.1583	2.5225	12600	0.1618
0.1512	2.5626	12800	0.1621
0.1617	2.6026	13000	0.1619
0.1646	2.6426	13200	0.1616
0.1473	2.6827	13400	0.1616
0.1648	2.7227	13600	0.1615
0.1524	2.7628	13800	0.1614
0.1678	2.8028	14000	0.1613
0.1637	2.8428	14200	0.1614
0.154	2.8829	14400	0.1615
0.1546	2.9229	14600	0.1613
0.1608	2.9630	14800	0.1612
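
Although the dataset is not documented, the Epoch and Step columns allow a rough estimate of its size. The sketch below assumes a single device, no gradient accumulation, and a constant batch size of 16; the epoch values are rounded to four decimals in the table, so the result is approximate.

```python
# (epoch, step) pairs taken from the first and last rows of the table above.
first_epoch, first_step = 0.0400, 200
last_epoch, last_step = 2.9630, 14800

train_batch_size = 16  # from the hyperparameter list
num_epochs = 3

# Optimizer steps per epoch, estimated from the slope of step vs. epoch.
steps_per_epoch = (last_step - first_step) / (last_epoch - first_epoch)

# Assumption: one optimizer step consumes exactly one batch of 16 examples.
examples_per_epoch = steps_per_epoch * train_batch_size
total_steps = steps_per_epoch * num_epochs

print(f"steps per epoch    ~ {steps_per_epoch:.0f}")
print(f"training examples  ~ {examples_per_epoch:.0f}")
print(f"total steps        ~ {total_steps:.0f}")
```

Under these assumptions the run covers roughly 5,000 steps per epoch, which corresponds to about 80,000 training examples and about 15,000 optimizer steps in total.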

Framework versions

Transformers 4.53.3
PyTorch 2.6.0+cu124
Datasets 4.4.1
Tokenizers 0.21.2

Model size: 0.2B params
Tensor type: F32 (Safetensors)