YuchenLi01
/

generatedMoreUniqueResponseNoGTv2_1.5B_sft_prompt_completion_ebs32_lr2e-05_epoch1.0_42

generatedMoreUniqueResponseNoGTv2_1.5B_sft_prompt_completion_ebs32_lr2e-05_epoch1.0_42

This model is a fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct on the YuchenLi01/MATH_Qwen2.5-1.5BInstruct_DPO_MoreUniqueResponseNoGTv2 dataset. It achieves the following results on the evaluation set:

Loss: 0.1390

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
num_devices: 8
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 1.0

Training results

Training Loss	Epoch	Step	Validation Loss
0.4026	0.0060	20	0.4138
0.1324	0.0120	40	0.1251
0.0832	0.0180	60	0.0836
0.0782	0.0240	80	0.0832
0.0812	0.0300	100	0.0851
0.0883	0.0360	120	0.0886
0.0808	0.0420	140	0.0928
0.0846	0.0480	160	0.0967
0.094	0.0540	180	0.1010
0.0863	0.0600	200	0.1063
0.1118	0.0660	220	0.1090
0.1092	0.0720	240	0.1125
0.0855	0.0780	260	0.1170
0.11	0.0840	280	0.1216
0.1175	0.0900	300	0.1255
0.1193	0.0960	320	0.1288
0.0993	0.1019	340	0.1327
0.106	0.1079	360	0.1349
0.1235	0.1139	380	0.1379
0.1182	0.1199	400	0.1375
0.1283	0.1259	420	0.1406
0.1043	0.1319	440	0.1401
0.1069	0.1379	460	0.1401
0.1065	0.1439	480	0.1435
0.107	0.1499	500	0.1431
0.1286	0.1559	520	0.1431
0.1261	0.1619	540	0.1445
0.1194	0.1679	560	0.1444
0.1183	0.1739	580	0.1454
0.1044	0.1799	600	0.1449
0.0984	0.1859	620	0.1457
0.1109	0.1919	640	0.1475
0.0947	0.1979	660	0.1474
0.096	0.2039	680	0.1468
0.1128	0.2099	700	0.1483
0.1083	0.2159	720	0.1494
0.1049	0.2219	740	0.1484
0.1235	0.2279	760	0.1494
0.1093	0.2339	780	0.1508
0.117	0.2399	800	0.1492
0.1068	0.2459	820	0.1502
0.1053	0.2519	840	0.1494
0.1223	0.2579	860	0.1517
0.1048	0.2639	880	0.1527
0.0966	0.2699	900	0.1521
0.0856	0.2759	920	0.1506
0.1047	0.2819	940	0.1525
0.1245	0.2879	960	0.1528
0.1053	0.2939	980	0.1519
0.0924	0.2999	1000	0.1531
0.1058	0.3058	1020	0.1536
0.1103	0.3118	1040	0.1517
0.0974	0.3178	1060	0.1533
0.1173	0.3238	1080	0.1534
0.1096	0.3298	1100	0.1557
0.099	0.3358	1120	0.1536
0.1013	0.3418	1140	0.1544
0.1043	0.3478	1160	0.1540
0.0936	0.3538	1180	0.1552
0.1015	0.3598	1200	0.1542
0.0959	0.3658	1220	0.1549
0.111	0.3718	1240	0.1556
0.0932	0.3778	1260	0.1558
0.1033	0.3838	1280	0.1564
0.0972	0.3898	1300	0.1557
0.0978	0.3958	1320	0.1539
0.0997	0.4018	1340	0.1547
0.0977	0.4078	1360	0.1529
0.1022	0.4138	1380	0.1550
0.0871	0.4198	1400	0.1547
0.1082	0.4258	1420	0.1545
0.1079	0.4318	1440	0.1544
0.1025	0.4378	1460	0.1560
0.0847	0.4438	1480	0.1555
0.0908	0.4498	1500	0.1545
0.1031	0.4558	1520	0.1533
0.0911	0.4618	1540	0.1530
0.0959	0.4678	1560	0.1538
0.0851	0.4738	1580	0.1547
0.101	0.4798	1600	0.1523
0.1048	0.4858	1620	0.1526
0.0983	0.4918	1640	0.1513
0.0932	0.4978	1660	0.1517
0.0941	0.5037	1680	0.1504
0.0959	0.5097	1700	0.1502
0.0809	0.5157	1720	0.1525
0.081	0.5217	1740	0.1506
0.0911	0.5277	1760	0.1510
0.0776	0.5337	1780	0.1506
0.0916	0.5397	1800	0.1502
0.0782	0.5457	1820	0.1502
0.09	0.5517	1840	0.1489
0.0838	0.5577	1860	0.1500
0.0862	0.5637	1880	0.1489
0.0809	0.5697	1900	0.1490
0.1064	0.5757	1920	0.1471
0.0792	0.5817	1940	0.1480
0.0919	0.5877	1960	0.1480
0.1025	0.5937	1980	0.1479
0.089	0.5997	2000	0.1475
0.1042	0.6057	2020	0.1481
0.0943	0.6117	2040	0.1470
0.0836	0.6177	2060	0.1459
0.0894	0.6237	2080	0.1460
0.0809	0.6297	2100	0.1458
0.0957	0.6357	2120	0.1463
0.1004	0.6417	2140	0.1466
0.0802	0.6477	2160	0.1468
0.1052	0.6537	2180	0.1467
0.0841	0.6597	2200	0.1461
0.0757	0.6657	2220	0.1457
0.0813	0.6717	2240	0.1459
0.0859	0.6777	2260	0.1450
0.0861	0.6837	2280	0.1451
0.0827	0.6897	2300	0.1452
0.0835	0.6957	2320	0.1446
0.092	0.7016	2340	0.1437
0.0734	0.7076	2360	0.1441
0.0755	0.7136	2380	0.1436
0.0905	0.7196	2400	0.1435
0.0846	0.7256	2420	0.1430
0.067	0.7316	2440	0.1429
0.0836	0.7376	2460	0.1425
0.087	0.7436	2480	0.1420
0.0911	0.7496	2500	0.1424
0.0766	0.7556	2520	0.1422
0.0763	0.7616	2540	0.1420
0.0925	0.7676	2560	0.1419
0.0898	0.7736	2580	0.1419
0.0897	0.7796	2600	0.1416
0.0817	0.7856	2620	0.1413
0.087	0.7916	2640	0.1402
0.0801	0.7976	2660	0.1404
0.0849	0.8036	2680	0.1402
0.0745	0.8096	2700	0.1402
0.071	0.8156	2720	0.1404
0.0882	0.8216	2740	0.1405
0.0851	0.8276	2760	0.1404
0.0809	0.8336	2780	0.1404
0.0714	0.8396	2800	0.1400
0.0798	0.8456	2820	0.1398
0.0799	0.8516	2840	0.1400
0.0929	0.8576	2860	0.1400
0.0819	0.8636	2880	0.1400
0.0712	0.8696	2900	0.1400
0.0993	0.8756	2920	0.1399
0.0837	0.8816	2940	0.1397
0.0856	0.8876	2960	0.1397
0.0775	0.8936	2980	0.1395
0.0709	0.8996	3000	0.1395
0.0831	0.9055	3020	0.1395
0.0751	0.9115	3040	0.1395
0.0828	0.9175	3060	0.1393
0.0767	0.9235	3080	0.1392
0.0707	0.9295	3100	0.1391
0.0653	0.9355	3120	0.1392
0.0726	0.9415	3140	0.1392
0.0725	0.9475	3160	0.1391
0.0854	0.9535	3180	0.1391
0.0848	0.9595	3200	0.1391
0.077	0.9655	3220	0.1391
0.0833	0.9715	3240	0.1391
0.0825	0.9775	3260	0.1390
0.0846	0.9835	3280	0.1390
0.0838	0.9895	3300	0.1390
0.0712	0.9955	3320	0.1391

Framework versions

Transformers 4.45.2
Pytorch 2.5.1+cu121
Datasets 3.5.0
Tokenizers 0.20.3

Downloads last month: 2

Safetensors

Model size

2B params

Tensor type

BF16

Model tree for YuchenLi01/generatedMoreUniqueResponseNoGTv2_1.5B_sft_prompt_completion_ebs32_lr2e-05_epoch1.0_42

Base model

Qwen/Qwen2.5-1.5B

Finetuned

Qwen/Qwen2.5-1.5B-Instruct

Finetuned

(1504)

this model