Length Value Model
This model is a fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct on the 7b_instruction_100k_16_train dataset. It achieves the following results on the evaluation set:
More information needed
The following hyperparameters were used during training:
| Training Loss | Epoch | Step | Validation Loss | Token Mean MAE | Token Mean Rel. Err. | Token Mean RMSE | Token Mean Seq Mean MAE | Token Mean Seq Mean Rel. Err. | Token Mean Seq Mean RMSE |
|---|---|---|---|---|---|---|---|---|---|
| 0.0082 | 0.032 | 50 | 0.0080 | 608425155.8249 | 0.4860 | 634337.1071 | 49016.6042 | 0.8634 | 2066.2285 |
| 0.0049 | 0.064 | 100 | 0.0052 | 488127386.7483 | 0.3914 | 536577.3978 | 38943.8809 | 0.7013 | 1669.2678 |
| 0.0049 | 0.096 | 150 | 0.0048 | 475120389.5790 | 0.3856 | 531628.7342 | 37852.7598 | 0.6617 | 1610.7651 |
| 0.0048 | 0.128 | 200 | 0.0049 | 481805109.8853 | 0.3577 | 533769.3790 | 38637.7402 | 0.6499 | 1623.8317 |
| 0.0042 | 0.16 | 250 | 0.0045 | 451726004.4679 | 0.3549 | 511599.7858 | 36269.7874 | 0.6364 | 1546.9978 |
| 0.0043 | 0.192 | 300 | 0.0045 | 452136195.4396 | 0.3354 | 506556.3922 | 36330.8552 | 0.5529 | 1537.9397 |
| 0.0039 | 0.224 | 350 | 0.0043 | 431075755.8509 | 0.3767 | 483720.3007 | 35022.8476 | 0.7019 | 1508.8952 |
| 0.0040 | 0.256 | 400 | 0.0042 | 428685113.7043 | 0.3644 | 488718.9404 | 34441.1474 | 0.7210 | 1482.5354 |
| 0.0044 | 0.288 | 450 | 0.0043 | 436808096.1055 | 0.3618 | 490426.2621 | 35491.9788 | 0.6338 | 1509.2700 |
| 0.0036 | 0.32 | 500 | 0.0043 | 425259647.9635 | 0.3842 | 475208.2751 | 34596.7729 | 0.6689 | 1503.8005 |
| 0.0038 | 0.352 | 550 | 0.0041 | 424481052.6458 | 0.3325 | 485194.9456 | 33965.9044 | 0.5568 | 1459.7803 |
| 0.0042 | 0.384 | 600 | 0.0041 | 416627471.4934 | 0.3477 | 473553.8559 | 33659.7396 | 0.6224 | 1451.3301 |
| 0.0043 | 0.416 | 650 | 0.0041 | 418919517.4599 | 0.3307 | 477484.7566 | 33742.9811 | 0.5391 | 1452.1818 |
| 0.0043 | 0.448 | 700 | 0.0041 | 425921146.5534 | 0.3072 | 490523.6028 | 34315.8195 | 0.4590 | 1471.0800 |
| 0.0036 | 0.48 | 750 | 0.0041 | 412971921.8994 | 0.3083 | 472029.2194 | 33441.2669 | 0.5165 | 1443.8789 |
| 0.0041 | 0.512 | 800 | 0.0040 | 421228461.5054 | 0.3372 | 485035.1077 | 33899.3248 | 0.5555 | 1450.2831 |
| 0.0041 | 0.544 | 850 | 0.0040 | 409085666.9095 | 0.3424 | 466717.6581 | 33191.6016 | 0.5851 | 1440.9357 |
| 0.0038 | 0.576 | 900 | 0.0042 | 412179388.5635 | 0.3695 | 467476.0377 | 33627.2967 | 0.6272 | 1451.9416 |
| 0.0036 | 0.608 | 950 | 0.0041 | 408180795.0375 | 0.3463 | 462878.8329 | 33098.6940 | 0.5822 | 1435.1700 |
| 0.0042 | 0.64 | 1000 | 0.0040 | 414082129.8004 | 0.3051 | 474770.3497 | 33469.6209 | 0.4600 | 1436.9670 |
| 0.0033 | 0.672 | 1050 | 0.0040 | 416573746.2561 | 0.3010 | 478667.6112 | 33390.7520 | 0.4735 | 1432.4401 |
| 0.0037 | 0.704 | 1100 | 0.0039 | 409080410.9753 | 0.3223 | 471784.7408 | 33069.0180 | 0.5653 | 1423.4942 |
| 0.0039 | 0.736 | 1150 | 0.0040 | 414529748.5283 | 0.3060 | 480363.9079 | 33461.9558 | 0.4526 | 1430.7729 |
| 0.0039 | 0.768 | 1200 | 0.0038 | 408060349.0209 | 0.3099 | 472453.4875 | 33012.9848 | 0.5047 | 1414.1793 |
| 0.0035 | 0.8 | 1250 | 0.0039 | 410530731.4820 | 0.3277 | 478798.7453 | 32934.7524 | 0.5611 | 1412.0277 |
| 0.0039 | 0.832 | 1300 | 0.0039 | 400444054.3551 | 0.3261 | 457524.1809 | 32501.9058 | 0.5401 | 1411.6451 |
| 0.0034 | 0.864 | 1350 | 0.0038 | 401191161.5502 | 0.3162 | 461911.0799 | 32531.2734 | 0.5506 | 1402.6450 |
| 0.0036 | 0.896 | 1400 | 0.0039 | 411589472.5259 | 0.3036 | 479207.3545 | 33028.4667 | 0.4974 | 1412.5106 |
| 0.0039 | 0.928 | 1450 | 0.0039 | 403316449.4579 | 0.2976 | 461425.6247 | 32678.8754 | 0.4571 | 1401.9725 |
| 0.0036 | 0.96 | 1500 | 0.0038 | 397996640.9071 | 0.3197 | 460458.7850 | 32191.5058 | 0.5528 | 1388.1403 |
| 0.0036 | 0.992 | 1550 | 0.0038 | 397878383.1569 | 0.3172 | 459268.7323 | 32082.3904 | 0.5086 | 1390.6809 |
| 0.0033 | 1.0237 | 1600 | 0.0038 | 403549105.5117 | 0.3001 | 470122.2352 | 32460.6207 | 0.4609 | 1393.3032 |
| 0.0032 | 1.0557 | 1650 | 0.0038 | 404166162.9221 | 0.2993 | 476033.3722 | 32285.2127 | 0.4485 | 1389.7234 |
| 0.0031 | 1.0877 | 1700 | 0.0038 | 393612283.8630 | 0.3053 | 460389.1674 | 31846.0432 | 0.5003 | 1382.9924 |
| 0.0031 | 1.1197 | 1750 | 0.0039 | 401458449.5915 | 0.3046 | 464624.1376 | 32486.4189 | 0.4536 | 1397.6015 |
| 0.0030 | 1.1517 | 1800 | 0.0038 | 387094167.9261 | 0.3162 | 442174.7457 | 31852.4104 | 0.5006 | 1384.3810 |
| 0.0028 | 1.1837 | 1850 | 0.0038 | 394617804.1685 | 0.2920 | 462627.0268 | 31840.2037 | 0.4669 | 1375.5938 |
| 0.0032 | 1.2157 | 1900 | 0.0038 | 392901609.9683 | 0.2958 | 461826.8537 | 31643.1195 | 0.4796 | 1370.3818 |
| 0.0029 | 1.2477 | 1950 | 0.0038 | 401406658.1758 | 0.3061 | 471571.4465 | 32139.0364 | 0.5093 | 1378.6055 |
| 0.0027 | 1.2797 | 2000 | 0.0038 | 393478169.1530 | 0.2977 | 462308.5230 | 31684.1146 | 0.4902 | 1366.1339 |
| 0.0031 | 1.3117 | 2050 | 0.0038 | 394208720.6071 | 0.2976 | 462952.6062 | 31793.0456 | 0.4881 | 1371.8230 |
| 0.0031 | 1.3437 | 2100 | 0.0037 | 396781398.3369 | 0.3049 | 463251.8912 | 32050.0905 | 0.4972 | 1377.4668 |
| 0.0036 | 1.3757 | 2150 | 0.0037 | 395437093.0954 | 0.3051 | 462512.9335 | 31641.5281 | 0.4831 | 1366.7757 |
| 0.0031 | 1.4077 | 2200 | 0.0037 | 395949867.5542 | 0.3025 | 461985.5436 | 31789.9705 | 0.4970 | 1368.6927 |
| 0.0031 | 1.4397 | 2250 | 0.0037 | 386212822.5036 | 0.3106 | 445312.4503 | 31486.1063 | 0.5107 | 1370.1466 |
| 0.0028 | 1.4717 | 2300 | 0.0038 | 389202214.0629 | 0.3226 | 449137.1804 | 31690.3852 | 0.5504 | 1376.1187 |
| 0.0033 | 1.5037 | 2350 | 0.0037 | 391200473.4113 | 0.3053 | 458915.1965 | 31654.9283 | 0.4863 | 1364.4807 |
| 0.0030 | 1.5357 | 2400 | 0.0037 | 388488638.5763 | 0.3043 | 454479.3683 | 31442.2902 | 0.4930 | 1360.3722 |
| 0.0030 | 1.5677 | 2450 | 0.0037 | 392525139.1263 | 0.3009 | 460564.8294 | 31699.8567 | 0.4748 | 1365.3168 |
| 0.0030 | 1.5997 | 2500 | 0.0037 | 393120724.2463 | 0.2990 | 462593.0911 | 31594.6174 | 0.4850 | 1362.0217 |
| 0.0026 | 1.6317 | 2550 | 0.0037 | 386413276.0903 | 0.3103 | 450750.4296 | 31360.8288 | 0.5154 | 1361.8087 |
| 0.0029 | 1.6637 | 2600 | 0.0037 | 387943931.7722 | 0.2964 | 454269.9894 | 31459.7615 | 0.4839 | 1359.0512 |
| 0.0030 | 1.6957 | 2650 | 0.0037 | 387453941.8684 | 0.2977 | 453519.8739 | 31395.3375 | 0.4791 | 1357.8446 |
| 0.0030 | 1.7277 | 2700 | 0.0037 | 388700001.2806 | 0.2988 | 455841.6119 | 31446.9061 | 0.4788 | 1357.2810 |
| 0.0029 | 1.7597 | 2750 | 0.0037 | 388746166.5099 | 0.2962 | 456414.0732 | 31448.9568 | 0.4697 | 1356.8091 |
| 0.0032 | 1.7917 | 2800 | 0.0037 | 388479514.7293 | 0.2973 | 456098.5994 | 31386.6973 | 0.4799 | 1355.9567 |
| 0.0028 | 1.8237 | 2850 | 0.0037 | 386241442.0539 | 0.3001 | 452872.4607 | 31286.2562 | 0.4876 | 1354.1872 |
| 0.0029 | 1.8557 | 2900 | 0.0037 | 386503821.0774 | 0.2998 | 453532.6895 | 31259.5156 | 0.4842 | 1353.4503 |
| 0.0030 | 1.8877 | 2950 | 0.0037 | 387462689.0075 | 0.3002 | 455169.0309 | 31311.5292 | 0.4874 | 1353.9187 |
| 0.0031 | 1.9197 | 3000 | 0.0037 | 386798994.6583 | 0.3009 | 454121.0297 | 31273.6807 | 0.4855 | 1353.7956 |
| 0.0033 | 1.9517 | 3050 | 0.0037 | 387538013.3891 | 0.2992 | 455192.9192 | 31321.1101 | 0.4847 | 1354.1975 |
| 0.0028 | 1.9837 | 3100 | 0.0037 | 386674178.7403 | 0.3004 | 453785.7409 | 31281.2366 | 0.4870 | 1353.7729 |
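
The card does not define how the error columns in the table are aggregated. The sketch below is one plausible reading, not the training code: it assumes a "token mean" metric pools every token position across all sequences before averaging, while a "seq mean" metric averages within each sequence first and then across sequences. The function names (`error_metrics`, `token_mean`, `seq_mean`) and the `eps` guard are hypothetical and chosen only for illustration.

```python
import math


def error_metrics(preds, targets, eps=1e-8):
    """Return MAE, mean relative error, and RMSE over paired values."""
    abs_errs = [abs(p - t) for p, t in zip(preds, targets)]
    n = len(abs_errs)
    mae = sum(abs_errs) / n
    # Relative error divides each absolute error by the target magnitude;
    # eps (an assumption) avoids division by zero for zero-valued targets.
    relerr = sum(e / max(abs(t), eps) for e, t in zip(abs_errs, targets)) / n
    rmse = math.sqrt(sum(e * e for e in abs_errs) / n)
    return {"mae": mae, "relerr": relerr, "rmse": rmse}


def token_mean(pred_seqs, target_seqs):
    """Pool all tokens from all sequences, then compute the metrics once."""
    flat_preds = [p for seq in pred_seqs for p in seq]
    flat_targets = [t for seq in target_seqs for t in seq]
    return error_metrics(flat_preds, flat_targets)


def seq_mean(pred_seqs, target_seqs):
    """Compute the metrics per sequence, then average across sequences."""
    per_seq = [error_metrics(p, t) for p, t in zip(pred_seqs, target_seqs)]
    return {k: sum(m[k] for m in per_seq) / len(per_seq) for k in per_seq[0]}


# Toy data: two sequences of per-token predictions and targets.
preds = [[1.0, 3.0], [10.0]]
targets = [[2.0, 2.0], [20.0]]
pooled = token_mean(preds, targets)
per_sequence = seq_mean(preds, targets)
```

Under this reading, long sequences dominate the pooled numbers, whereas the per-sequence average weights every sequence equally, which is one reason the two families of columns can differ by a wide margin.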