Intuitor - a sunblaze-ucb Collection

updated Jun 25, 2025

Models from the paper "Learning to Reason without External Rewards"

sunblaze-ucb/Qwen2.5-3B-Intuitor-MATH-1EPOCH

Text Generation • 3B • Updated Aug 13, 2025 • 5
sunblaze-ucb/Qwen2.5-1.5B-Intuitor-MATH-1EPOCH

Text Generation • 2B • Updated Aug 13, 2025 • 16
sunblaze-ucb/Qwen3-14B-Intuitor-MATH-1EPOCH

Text Generation • 15B • Updated Aug 13, 2025 • 5
sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH

Text Generation • 7B • Updated Aug 13, 2025 • 11
sunblaze-ucb/Qwen3-14B-GRPO-MATH-1EPOCH

Text Generation • 15B • Updated Aug 13, 2025 • 10
sunblaze-ucb/OLMo-2-7B-SFT-GRPO-MATH-1EPOCH

Text Generation • 7B • Updated Aug 13, 2025 • 9
sunblaze-ucb/Qwen2.5-3B-GRPO-MATH-1EPOCH

Text Generation • 3B • Updated Aug 13, 2025 • 5
sunblaze-ucb/Qwen2.5-1.5B-GRPO-MATH-1EPOCH

Text Generation • 2B • Updated Aug 13, 2025 • 12
sunblaze-ucb/OLMo-2-7B-SFT-GRPO-MATH-1EPOCH-SYSP

Text Generation • 7B • Updated Mar 7 • 4
sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH-SYSP

Text Generation • 7B • Updated Mar 7 • 3
sunblaze-ucb/Llama-3.2-3B-Instruct-Intuitor-MATH-1EPOCH

Text Generation • 4B • Updated Mar 7 • 9 • 1
sunblaze-ucb/Llama-3.2-3B-Instruct-GRPO-MATH-1EPOCH

Text Generation • 4B • Updated Mar 7 • 5