sunblaze-ucb's Collections
Intuitor
Updated Jun 25, 2025
Models in the paper "Learning to Reason without External Rewards"
sunblaze-ucb/Qwen2.5-3B-Intuitor-MATH-1EPOCH • Text Generation • 3B • Updated Aug 13, 2025 • 5
sunblaze-ucb/Qwen2.5-1.5B-Intuitor-MATH-1EPOCH • Text Generation • 2B • Updated Aug 13, 2025 • 16
sunblaze-ucb/Qwen3-14B-Intuitor-MATH-1EPOCH • Text Generation • 15B • Updated Aug 13, 2025 • 5
sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH • Text Generation • 7B • Updated Aug 13, 2025 • 11
sunblaze-ucb/Qwen3-14B-GRPO-MATH-1EPOCH • Text Generation • 15B • Updated Aug 13, 2025 • 10
sunblaze-ucb/OLMo-2-7B-SFT-GRPO-MATH-1EPOCH • Text Generation • 7B • Updated Aug 13, 2025 • 9
sunblaze-ucb/Qwen2.5-3B-GRPO-MATH-1EPOCH • Text Generation • 3B • Updated Aug 13, 2025 • 5
sunblaze-ucb/Qwen2.5-1.5B-GRPO-MATH-1EPOCH • Text Generation • 2B • Updated Aug 13, 2025 • 12
sunblaze-ucb/OLMo-2-7B-SFT-GRPO-MATH-1EPOCH-SYSP • Text Generation • 7B • Updated Mar 7 • 4
sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH-SYSP • Text Generation • 7B • Updated Mar 7 • 3
sunblaze-ucb/Llama-3.2-3B-Instruct-Intuitor-MATH-1EPOCH • Text Generation • 4B • Updated Mar 7 • 9 • 1
sunblaze-ucb/Llama-3.2-3B-Instruct-GRPO-MATH-1EPOCH • Text Generation • 4B • Updated Mar 7 • 5