od2961/qwen2.5-Math-1.5b-drgrpo-readmeflash-a6000-5epoch-seed42

This repo stores the seed-42 DR.GRPO checkpoints from our Qwen2.5-Math-1.5B readme-flash math-reasoning training run, organized by training step.

Layout

  • checkpoints/step_XXXXX/: exported model checkpoints ready for inference, one directory per saved step (XXXXX is the zero-padded step number)
  • eval_results/: per-benchmark JSON outputs saved at eval boundaries
  • metadata/checkpoints_index.json: machine-readable checkpoint manifest
  • metadata/run_manifest.json: local run provenance
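Because each checkpoint sits in its own checkpoints/step_XXXXX/ directory, a single step can be fetched without pulling the whole repo. The following is a minimal sketch, assuming the huggingface_hub package is installed. The helper names step_dir and download_step are hypothetical and not part of this repo.

```python
from pathlib import Path

REPO_ID = "od2961/qwen2.5-Math-1.5b-drgrpo-readmeflash-a6000-5epoch-seed42"


def step_dir(step: int) -> str:
    """Return the repo-relative directory for a step, e.g. 16 -> checkpoints/step_00016."""
    if step < 0:
        raise ValueError("step must be non-negative")
    return f"checkpoints/step_{step:05d}"


def download_step(step: int, with_metadata: bool = True) -> Path:
    """Download one checkpoint directory (and optionally metadata/) from the Hub."""
    # Imported lazily so step_dir() works even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    patterns = [f"{step_dir(step)}/*"]
    if with_metadata:
        patterns.append("metadata/*")
    # allow_patterns restricts the download to files matching these globs.
    local_root = snapshot_download(repo_id=REPO_ID, allow_patterns=patterns)
    return Path(local_root) / step_dir(step)
```

Calling download_step(240) would return the local path of the last listed checkpoint. Whether that directory can be passed directly to a loader such as transformers.AutoModelForCausalLM.from_pretrained depends on the export format, which this README does not specify.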

Source Run Root

  • Local save root: /n/fs/similarity/maxent-grpo/var/data/oat_zero_exact_1p5b_20260412_031159_seed42
  • Objective: dr.grpo

Available Checkpoints

Step        Size (GiB)  Eval files
step_00000  2.89        aime, amc, math, minerva, olympiad_bench
step_00016  2.89        aime, amc, math, minerva, olympiad_bench
step_00032  2.89        aime, amc, math, minerva, olympiad_bench
step_00048  2.89        aime, amc, math, minerva, olympiad_bench
step_00064  2.89        aime, amc, math, minerva, olympiad_bench
step_00080  2.89        aime, amc, math, minerva, olympiad_bench
step_00096  2.89        aime, amc, math, minerva, olympiad_bench
step_00112  2.89        aime, amc, math, minerva, olympiad_bench
step_00128  2.89        aime, amc, math, minerva, olympiad_bench
step_00144  2.89        aime, amc, math, minerva, olympiad_bench
step_00160  2.89        aime, amc, math, minerva, olympiad_bench
step_00176  2.89        aime, amc, math, minerva, olympiad_bench
step_00192  2.89        aime, amc, math, minerva, olympiad_bench
step_00208  2.89        aime, amc, math, minerva, olympiad_bench
step_00224  2.89        aime, amc, math, minerva, olympiad_bench
step_00240  2.89        aime, amc, math, minerva, olympiad_bench
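
The listed checkpoints fall on a regular grid: every 16 steps from 0 through 240. As a rough sketch, the step names and the total download size can be derived from the table values alone. The constants below are copied from the table, and listed_steps is a hypothetical helper.

```python
STEP_STRIDE = 16   # spacing between listed checkpoints
LAST_STEP = 240    # last step listed in the table
SIZE_GIB = 2.89    # per-checkpoint size from the table


def listed_steps() -> list[str]:
    """Return the checkpoint directory names in training order."""
    return [f"step_{s:05d}" for s in range(0, LAST_STEP + 1, STEP_STRIDE)]


names = listed_steps()
# Total size if every checkpoint is downloaded.
total_gib = round(len(names) * SIZE_GIB, 2)
print(len(names), names[0], names[-1], total_gib)
```

This gives 16 checkpoints and about 46.24 GiB in total, which is worth knowing before cloning the full repo rather than a single step.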