TMLR-Group-HF/Self-Certainty-Qwen3-1.7B-Base-MATH Text Generation • 2B • Updated Oct 11, 2025 • 9 • 1
TMLR-Group-HF/Self-Certainty-Llama-3.2-3B-Instruct-MATH Text Generation • 4B • Updated Oct 11, 2025 • 10
TMLR-Group-HF/Co-rewarding-I-Llama-3.2-3B-Instruct-MATH Text Generation • 4B • Updated Oct 11, 2025 • 10
TMLR-Group-HF/Self-Certainty-Qwen2.5-7B-MATH Text Generation • 8B • Updated Oct 11, 2025 • 8 • 1
TMLR-Group-HF/Co-rewarding-I-Qwen3-8B-Base-MATH Text Generation • 8B • Updated Oct 11, 2025 • 122 • 1
TMLR-Group-HF/Co-rewarding-I-Qwen3-4B-Base-MATH Text Generation • 4B • Updated Oct 11, 2025 • 43 • 1
TMLR-Group-HF/Co-rewarding-II-Qwen2.5-7B-MATH Text Generation • 8B • Updated Oct 11, 2025 • 7 • 1
TMLR-Group-HF/Co-rewarding-I-Qwen3-1.7B-Base-MATH Text Generation • 2B • Updated Oct 11, 2025 • 6
TMLR-Group-HF/Self-Certainty-Qwen3-8B-Base-MATH Text Generation • 8B • Updated Oct 11, 2025 • 10 • 1
TMLR-Group-HF/Co-rewarding-I-Qwen2.5-7B-MATH Text Generation • 8B • Updated Oct 11, 2025 • 6 • 1
TMLR-Group-HF/Entropy-Qwen3-8B-Base-OpenRS Text Generation • 8B • Updated Oct 11, 2025 • 7 • 1
TMLR-Group-HF/Entropy-Qwen3-8B-Base-DAPO14k Text Generation • 8B • Updated Oct 11, 2025 • 11 • 1
TMLR-Group-HF/Co-rewarding-I-Qwen2.5-3B-MATH Text Generation • 3B • Updated Oct 11, 2025 • 5 • 1