GXPO-math-llama
Checkpoints from run 'gxpo_llama_iso_3B_k_5_shutoff_trajectory_aware_h' for iso-BP comparison.
Updated • 23
Note: run=gxpo_llama_iso_3B_k_5_shutoff_trajectory_aware_hendrycks_math_seed42_20260413_161029 | checkpoint=best_checkpoint
swapnil7777/gxpo-gxpo-llama-iso-3b-k-5-shutoff-trajectory-aware-hendrycks-math-seed42-20260413-1610-5967cee3
Updated
Note: run=gxpo_llama_iso_3B_k_5_shutoff_trajectory_aware_hendrycks_math_seed42_20260413_161029 | checkpoint=bp_budget_108
swapnil7777/gxpo-gxpo-llama-iso-3b-k-5-shutoff-trajectory-aware-hendrycks-math-seed42-20260413-1610-48d1aff4
Updated
Note: run=gxpo_llama_iso_3B_k_5_shutoff_trajectory_aware_hendrycks_math_seed42_20260413_161029 | checkpoint=bp_budget_204
swapnil7777/gxpo-gxpo-llama-iso-3b-k-5-shutoff-trajectory-aware-hendrycks-math-seed42-20260413-1610-207002a8
Updated
Note: run=gxpo_llama_iso_3B_k_5_shutoff_trajectory_aware_hendrycks_math_seed42_20260413_161029 | checkpoint=bp_budget_300
swapnil7777/gxpo-gxpo-llama-iso-3b-k-5-shutoff-trajectory-aware-hendrycks-math-seed42-20260413-1610-93da31bc
Updated
Note: run=gxpo_llama_iso_3B_k_5_shutoff_trajectory_aware_hendrycks_math_seed42_20260413_161029 | checkpoint=bp_budget_408
swapnil7777/gxpo-gxpo-llama-iso-3b-k-5-shutoff-trajectory-aware-hendrycks-math-seed42-20260413-1610-69e7bbed
Updated • 22
Note: run=gxpo_llama_iso_3B_k_5_shutoff_trajectory_aware_hendrycks_math_seed42_20260413_161029 | checkpoint=checkpoint-356
swapnil7777/gxpo-gxpo-llama-iso-3b-k-5-shutoff-trajectory-aware-hendrycks-math-seed42-20260413-1610-707e7dca
Updated • 22
Note: run=gxpo_llama_iso_3B_k_5_shutoff_trajectory_aware_hendrycks_math_seed42_20260413_161029 | checkpoint=checkpoint-378
swapnil7777/gxpo-gxpo-llama-iso-3b-k-5-shutoff-trajectory-aware-hendrycks-math-seed42-20260413-1610-51b407f4
Updated • 22
Note: run=gxpo_llama_iso_3B_k_5_shutoff_trajectory_aware_hendrycks_math_seed42_20260413_161029 | checkpoint=checkpoint-379
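
Each note in this collection follows the same `run=... | checkpoint=...` pattern, so the entries can be indexed programmatically. The following is a minimal sketch, not part of the collection itself: the helper names `parse_note` and `budget_of` are hypothetical, and reading the number in `bp_budget_N` as a numeric budget value is an assumption based only on the label.

```python
from typing import Dict, Optional


def parse_note(note: str) -> Dict[str, str]:
    """Split a 'key=value | key=value' note string into a dict."""
    fields: Dict[str, str] = {}
    for part in note.split("|"):
        key, sep, value = part.strip().partition("=")
        if sep:  # ignore fragments that contain no '='
            fields[key.strip()] = value.strip()
    return fields


def budget_of(checkpoint: str) -> Optional[int]:
    """Return N for labels of the form 'bp_budget_N', else None.

    Assumption: N is a plain non-negative integer suffix.
    """
    prefix = "bp_budget_"
    suffix = checkpoint[len(prefix):]
    if checkpoint.startswith(prefix) and suffix.isdigit():
        return int(suffix)
    return None


# Note text copied from one entry in the list (the part after the "Note" label).
note = (
    "run=gxpo_llama_iso_3B_k_5_shutoff_trajectory_aware_"
    "hendrycks_math_seed42_20260413_161029 | checkpoint=bp_budget_108"
)
info = parse_note(note)
print(info["run"])
print(info["checkpoint"], budget_of(info["checkpoint"]))
```

Labels such as `best_checkpoint` or `checkpoint-356` make `budget_of` return `None`, which keeps the budget-indexed checkpoints separable from the step-indexed ones when sorting.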