zhongweixie/Llama-3.2-1B-NEW-dpo-math-5epoch-lr2e-7-bs16-beta0.01-entropy_non_linear • Updated Jul 26, 2025
zhongweixie/Llama-3.2-1B-NEW-dpo-math-30epoch-lr2e-7-bs16-beta0.01-entropy_non_linear_15001 • Updated Jul 1, 2025
zhongweixie/Llama-3.2-1B-NEW-dpo-math-30epoch-lr2e-7-bs16-beta0.01-entropy_non_linear_1500 Text Generation • 1B • Updated Jul 1, 2025 • 2