Length Value Model
This model is a fine-tuned version of Qwen/Qwen3-1.7B-Base on the 30b_a3b_math_95k_16_train dataset. The card does not list final evaluation-set results separately; the validation metrics logged during training appear in the table below.
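The hyperparameters used during training are not listed in this card. The table below reports the training loss and validation metrics logged every 50 steps over roughly two epochs: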
| Training Loss | Epoch | Step | Validation Loss | Token Mean MAE | Token Mean RelErr | Token Mean RMSE | Token Mean Seq Mean MAE | Token Mean Seq Mean RelErr | Token Mean Seq Mean RMSE |
|---|---|---|---|---|---|---|---|---|---|
| 0.8838 | 0.0337 | 50 | 0.0152 | 31962873332.2035 | 1.2905 | 9067506.1269 | 3893413.1969 | 1.5436 | 67383.2700 |
| 0.9150 | 0.0675 | 100 | 0.0130 | 29082752791.3909 | 1.0013 | 8487870.1899 | 3541616.5108 | 1.2401 | 62469.0928 |
| 0.7341 | 0.1012 | 150 | 0.0121 | 28461424534.9630 | 0.9021 | 8596978.0872 | 3464854.9661 | 1.0606 | 59611.2344 |
| 0.6296 | 0.1350 | 200 | 0.0114 | 26886060544.5233 | 0.8831 | 7827202.9967 | 3272673.0010 | 1.2330 | 59088.6549 |
| 0.6926 | 0.1687 | 250 | 0.0111 | 26591412583.9368 | 0.8925 | 7834076.2851 | 3236449.6441 | 1.1603 | 57334.6885 |
| 0.6139 | 0.2025 | 300 | 0.0108 | 26702937964.4749 | 0.7990 | 8092083.5720 | 3249971.2530 | 0.9607 | 56151.5083 |
| 0.5502 | 0.2362 | 350 | 0.0107 | 26309989800.4493 | 0.7767 | 7943355.2861 | 3202539.2777 | 1.0370 | 55998.6088 |
| 0.5278 | 0.2700 | 400 | 0.0111 | 27214946205.0021 | 0.7060 | 8392862.9583 | 3311882.0626 | 0.9070 | 56098.4729 |
| 0.4964 | 0.3037 | 450 | 0.0107 | 26375763016.5343 | 0.7337 | 8052422.8244 | 3210015.3622 | 0.9549 | 55505.9211 |
| 0.4826 | 0.3374 | 500 | 0.0107 | 26174723423.1948 | 0.7004 | 8016074.2561 | 3185610.6003 | 0.8731 | 54969.6635 |
| 0.4651 | 0.3712 | 550 | 0.0106 | 25923456973.7465 | 0.7817 | 7764392.2592 | 3155471.7348 | 1.0857 | 55274.2759 |
| 0.4350 | 0.4049 | 600 | 0.0107 | 26005477322.2059 | 0.7476 | 7835539.6734 | 3164525.7692 | 1.0798 | 54937.2762 |
| 0.4476 | 0.4387 | 650 | 0.0107 | 27094116153.3255 | 0.6869 | 8314936.4869 | 3297115.6696 | 0.8879 | 55315.1395 |
| 0.4649 | 0.4724 | 700 | 0.0103 | 26127305157.3229 | 0.7682 | 7911443.6463 | 3179186.8054 | 1.0535 | 54562.6628 |
| 0.4550 | 0.5062 | 750 | 0.0107 | 26642414381.7156 | 0.6877 | 8177778.9883 | 3241904.4287 | 0.9428 | 54600.9327 |
| 0.4041 | 0.5399 | 800 | 0.0109 | 26305712502.5043 | 0.5907 | 8205801.4943 | 3201468.6334 | 0.7266 | 53629.8888 |
| 0.4041 | 0.5737 | 850 | 0.0104 | 25481182626.3355 | 0.6144 | 7828342.2509 | 3101531.0826 | 0.8132 | 53606.8506 |
| 0.4068 | 0.6074 | 900 | 0.0103 | 25884806288.5230 | 0.7346 | 7809702.6584 | 3150509.8743 | 1.0223 | 54361.4305 |
| 0.3763 | 0.6411 | 950 | 0.0104 | 25686105330.9954 | 0.6695 | 7858742.2040 | 3125837.7557 | 0.8839 | 53570.9686 |
| 0.4063 | 0.6749 | 1000 | 0.0105 | 25726341541.9370 | 0.6561 | 7836993.6156 | 3131141.7075 | 0.8876 | 53564.3389 |
| 0.3715 | 0.7086 | 1050 | 0.0104 | 25577657636.7409 | 0.6572 | 7896683.5400 | 3112295.4313 | 0.8670 | 52933.0834 |
| 0.4101 | 0.7424 | 1100 | 0.0104 | 25697994539.2222 | 0.6821 | 7962045.3988 | 3128070.5844 | 0.9236 | 53051.6250 |
| 0.3296 | 0.7761 | 1150 | 0.0107 | 25976736045.6458 | 0.6162 | 8096339.3832 | 3161779.6041 | 0.7932 | 52897.0733 |
| 0.3780 | 0.8099 | 1200 | 0.0104 | 25615151552.0550 | 0.6382 | 7937833.8960 | 3117088.2811 | 0.8487 | 52774.9458 |
| 0.3768 | 0.8436 | 1250 | 0.0105 | 25941770840.4481 | 0.6784 | 7899501.1527 | 3156785.4411 | 0.9677 | 53674.3792 |
| 0.3788 | 0.8774 | 1300 | 0.0103 | 25405377328.5631 | 0.6109 | 7866119.5046 | 3091528.6967 | 0.7914 | 52509.0489 |
| 0.3696 | 0.9111 | 1350 | 0.0103 | 25532641757.3267 | 0.7108 | 7854749.4646 | 3107104.7411 | 0.9830 | 52819.0236 |
| 0.3682 | 0.9448 | 1400 | 0.0104 | 25367025567.7703 | 0.6413 | 7852615.6312 | 3086526.1212 | 0.8465 | 52453.1452 |
| 0.3389 | 0.9786 | 1450 | 0.0103 | 25576755953.8150 | 0.6636 | 7963191.7338 | 3112153.6540 | 0.8621 | 52445.6380 |
| 0.3944 | 1.0121 | 1500 | 0.0103 | 25320670727.8167 | 0.6504 | 7864982.3833 | 3081025.3964 | 0.8569 | 52112.9809 |
| 0.3105 | 1.0459 | 1550 | 0.0103 | 25486237905.7501 | 0.6173 | 7879228.6209 | 3100864.6937 | 0.8305 | 52328.9515 |
| 0.2800 | 1.0796 | 1600 | 0.0103 | 25488200028.0725 | 0.6656 | 7883183.3556 | 3101773.2576 | 0.9099 | 52448.8037 |
| 0.3174 | 1.1134 | 1650 | 0.0104 | 25861998661.0543 | 0.6113 | 7982692.4526 | 3146830.2875 | 0.8534 | 52835.8204 |
| 0.3140 | 1.1471 | 1700 | 0.0102 | 25521483105.0387 | 0.6678 | 7877753.0629 | 3105780.1889 | 0.9222 | 52315.0782 |
| 0.3094 | 1.1809 | 1750 | 0.0102 | 25419660857.8242 | 0.6686 | 7818757.9673 | 3093012.4016 | 0.9424 | 52418.1496 |
| 0.3283 | 1.2146 | 1800 | 0.0102 | 25102394546.4403 | 0.6894 | 7745066.5962 | 3055248.9459 | 0.9429 | 52037.3113 |
| 0.3099 | 1.2484 | 1850 | 0.0103 | 25604987186.0661 | 0.6500 | 7944719.9374 | 3116075.6429 | 0.9077 | 52485.8584 |
| 0.3335 | 1.2821 | 1900 | 0.0105 | 25588991772.1383 | 0.6070 | 7976323.3992 | 3113838.4760 | 0.8091 | 52169.5469 |
| 0.2886 | 1.3158 | 1950 | 0.0100 | 25196339839.8400 | 0.6687 | 7774726.7876 | 3066369.9007 | 0.9250 | 52171.6559 |
| 0.3390 | 1.3496 | 2000 | 0.0102 | 25348500890.8203 | 0.6378 | 7850771.5124 | 3084686.9669 | 0.8852 | 52099.1455 |
| 0.3311 | 1.3833 | 2050 | 0.0102 | 25315336130.1431 | 0.6473 | 7864825.8588 | 3080836.5537 | 0.8774 | 52034.2528 |
| 0.3267 | 1.4171 | 2100 | 0.0101 | 25085917619.3055 | 0.7049 | 7637916.0089 | 3053308.5697 | 1.0112 | 52557.2485 |
| 0.3251 | 1.4508 | 2150 | 0.0102 | 25324994805.8852 | 0.6509 | 7870383.6193 | 3082062.2541 | 0.8885 | 51898.1139 |
| 0.2910 | 1.4846 | 2200 | 0.0101 | 25160413607.9405 | 0.6562 | 7790770.0594 | 3061667.9161 | 0.9123 | 51900.3286 |
| 0.3220 | 1.5183 | 2250 | 0.0102 | 25440209435.1005 | 0.6508 | 7869804.5901 | 3095594.9222 | 0.9117 | 52125.3182 |
| 0.2914 | 1.5521 | 2300 | 0.0102 | 25518166445.5846 | 0.6712 | 7879342.7310 | 3105138.3217 | 0.9555 | 52255.2315 |
| 0.2871 | 1.5858 | 2350 | 0.0101 | 25144737131.4202 | 0.6620 | 7778736.8958 | 3059855.0501 | 0.9193 | 51879.8562 |
| 0.3406 | 1.6196 | 2400 | 0.0101 | 25245011524.3196 | 0.6657 | 7818310.1363 | 3072268.1235 | 0.9357 | 51888.5041 |
| 0.3289 | 1.6533 | 2450 | 0.0101 | 25212716686.9061 | 0.6817 | 7774020.1217 | 3068524.1634 | 0.9712 | 52036.5929 |
| 0.3103 | 1.6870 | 2500 | 0.0101 | 25199619583.6928 | 0.6777 | 7802074.2305 | 3066695.1766 | 0.9584 | 51900.3792 |
| 0.2776 | 1.7208 | 2550 | 0.0101 | 25249907548.2346 | 0.6385 | 7854459.2565 | 3072602.8266 | 0.8874 | 51775.4323 |
| 0.2726 | 1.7545 | 2600 | 0.0102 | 25253709515.6708 | 0.6519 | 7843764.1642 | 3073097.8076 | 0.9082 | 51811.0960 |
| 0.3006 | 1.7883 | 2650 | 0.0101 | 25252189159.9992 | 0.6503 | 7837001.5254 | 3072967.6592 | 0.9112 | 51812.2177 |
| 0.3168 | 1.8220 | 2700 | 0.0101 | 25202676950.5567 | 0.6489 | 7824637.5506 | 3066863.8581 | 0.9075 | 51768.4061 |
| 0.3199 | 1.8558 | 2750 | 0.0101 | 25252600129.8480 | 0.6557 | 7835128.3652 | 3072959.5303 | 0.9195 | 51824.7472 |
| 0.2877 | 1.8895 | 2800 | 0.0101 | 25183954796.2180 | 0.6560 | 7813811.6609 | 3064627.4014 | 0.9180 | 51755.0915 |
| 0.3383 | 1.9233 | 2850 | 0.0101 | 25204580360.5053 | 0.6527 | 7822050.6876 | 3067122.0695 | 0.9116 | 51757.7305 |
| 0.3070 | 1.9570 | 2900 | 0.0101 | 25188540475.4139 | 0.6544 | 7813731.1015 | 3065199.7823 | 0.9144 | 51756.3663 |
| 0.3022 | 1.9907 | 2950 | 0.0101 | 25189441970.3433 | 0.6544 | 7814560.9700 | 3065276.1438 | 0.9143 | 51757.7307 |
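
The card does not define the MAE, RelErr, and RMSE columns or the two aggregation modes in the table above. As a rough illustration only, the sketch below uses textbook definitions and assumes that one mode pools every token across sequences while the other averages per-sequence scores. The function names (`regression_errors`, `token_mean`, `seq_mean`) and the `eps` guard are hypothetical, and the actual training code may normalize differently, so these functions are not expected to reproduce the reported magnitudes.

```python
import math


def regression_errors(pred, target, eps=1e-8):
    """Return (mae, relerr, rmse) for two equal-length lists of floats.

    Textbook definitions; `eps` is an assumed guard against division by zero.
    """
    abs_err = [abs(p - t) for p, t in zip(pred, target)]
    n = len(abs_err)
    mae = sum(abs_err) / n
    relerr = sum(e / (abs(t) + eps) for e, t in zip(abs_err, target)) / n
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)
    return mae, relerr, rmse


def token_mean(pred_seqs, target_seqs):
    """Assumed pooled mode: flatten all tokens, then compute the errors once."""
    flat_pred = [p for seq in pred_seqs for p in seq]
    flat_target = [t for seq in target_seqs for t in seq]
    return regression_errors(flat_pred, flat_target)


def seq_mean(pred_seqs, target_seqs):
    """Assumed per-sequence mode: compute errors per sequence, then average."""
    per_seq = [regression_errors(p, t) for p, t in zip(pred_seqs, target_seqs)]
    n = len(per_seq)
    return tuple(sum(m[i] for m in per_seq) / n for i in range(3))


# Toy per-token value predictions for two sequences of different length.
preds = [[10.0, 8.0], [4.0]]
targets = [[12.0, 8.0], [2.0]]

pooled = token_mean(preds, targets)
averaged = seq_mean(preds, targets)
print("pooled   (mae, relerr, rmse):", pooled)
print("averaged (mae, relerr, rmse):", averaged)
```

The two modes diverge whenever sequence lengths differ: pooling weights long sequences more heavily, while per-sequence averaging gives each sequence equal weight.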
Base model: Qwen/Qwen3-1.7B-Base