Article Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty? zhangchenxu • Feb 25 • 14
VisualSphinx: Large-Scale Synthetic Vision Logic Puzzles for RL Paper • 2505.23977 • Published May 29, 2025 • 10
VisualSphinx-V1 Collection VisualSphinx-V1 is the largest fully synthetic open-source dataset of vision logic puzzles. • 7 items • Updated Jun 3, 2025 • 1
TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning Paper • 2505.14625 • Published May 20, 2025 • 13