VerIF: Verification Engineering for Reinforcement Learning in Instruction Following (arXiv:2506.09942)
This folder contains four JSONL files that form a minimal example for merged training and evaluation across two tasks: math reasoning and instruction following.

| Field | Value |
| --- | --- |
| Lines | 17,398 |
| Source | zhuzilin/dapo-math-17k |
| Reference | DAPO paper (ByteDance Seed) |

Math reasoning training data originally released alongside the DAPO paper. Each line includes a `data_source` field set to `dapo-math-17k`.
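
As a quick sanity check on the `data_source` field described above, the following minimal sketch counts its values across JSONL lines. The `prompt` key in the sample records is a hypothetical placeholder, not a confirmed field of this dataset; only `data_source` comes from the description above.

```python
import json
from collections import Counter
from typing import Iterable


def count_data_sources(lines: Iterable[str]) -> Counter:
    """Count the values of the data_source field across JSONL lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Records without the field are grouped under a sentinel key.
        counts[record.get("data_source", "<missing>")] += 1
    return counts


# Hypothetical sample records; the "prompt" key is an assumption.
sample = [
    '{"data_source": "dapo-math-17k", "prompt": "1+1?"}',
    '{"data_source": "dapo-math-17k", "prompt": "2+2?"}',
]
print(count_data_sources(sample))
```

To run it on a real file, pass an open file handle instead of the sample list, since iterating over a file yields one line at a time.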

| Field | Value |
| --- | --- |
| Lines | 19,756 |
| Source | THU-KEG/VerInstruct |
| Reference | VerInstruct paper |

Instruction-following training data. The original dataset provides both hard (function-verifiable) and soft (LLM-judge rubric-based) reward signals. For simplicity, only items with hard constraints are included here; soft constraints have been removed.
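
The filtering described above could look roughly like the sketch below. It is not the script used to build this file: the `constraints` list and its `type` values (`"hard"` / `"soft"`) are hypothetical names chosen for illustration, and the actual VerInstruct schema may differ.

```python
from typing import Optional


def keep_hard_constraints(item: dict) -> Optional[dict]:
    """Return a copy of item with only hard constraints, or None if it has none.

    Assumes (hypothetically) that each item has a "constraints" list whose
    entries carry a "type" of either "hard" or "soft".
    """
    hard = [c for c in item.get("constraints", []) if c.get("type") == "hard"]
    if not hard:
        return None  # the item has no function-verifiable constraint: drop it
    # Build a new dict so the input item is left unmodified.
    return {**item, "constraints": hard}


# Hypothetical example item.
example = {
    "prompt": "Write a haiku.",
    "constraints": [
        {"type": "hard", "check": "three_lines"},
        {"type": "soft", "rubric": "evokes autumn"},
    ],
}
filtered = keep_hard_constraints(example)
print(filtered["constraints"])
```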

| Field | Value |
| --- | --- |
| Lines | 32 |
| Source | zhuzilin/aime-2024 |
| Task | Math evaluation |

| Field | Value |
| --- | --- |
| Lines | 300 |
| Source | zyzshishui0627/IFBench |
| Task | Instruction-following evaluation |
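
For merged training across the two tasks, one simple option is to concatenate the two training files and shuffle them with a fixed seed. The sketch below assumes this approach; the function name `merge_jsonl` and any file paths passed to it are hypothetical, and a training framework may offer its own way to mix datasets.

```python
import json
import random
from typing import Sequence


def merge_jsonl(paths: Sequence[str], out_path: str, seed: int = 0) -> int:
    """Concatenate JSONL files, shuffle deterministically, and write the result.

    Returns the number of records written.
    """
    records = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            # Parse each non-blank line as one JSON record.
            records.extend(json.loads(line) for line in f if line.strip())
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(records)
    with open(out_path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(records)
```

Calling `merge_jsonl([math_path, if_path], merged_path)` with the two training files would produce a single shuffled file; the evaluation files are best kept separate so each task can be scored on its own.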