Centralized Prompt Configuration
All prompts for question generation and solution generation are centralized in src/config/prompts.py.
Why Centralized?
- Consistency: Same prompts across SFT training, GRPO, PPO, and inference
- Maintainability: Single source of truth for all prompt text
- Flexibility: Easy to tune prompts without hunting through multiple files
Usage
Question Generation
from src.config.prompts import create_generator_messages
instruction = "Generate a problem about fractions in a shopping context"
messages = create_generator_messages(instruction)
# Returns:
# [
# {"role": "system", "content": GENERATOR_SYSTEM_PROMPT},
# {"role": "user", "content": "### Task: Generate Question\n{instruction}"}
# ]
Solution Generation
from src.config.prompts import create_solver_messages
question = "If John has 5 apples and gives 2 away, how many does he have?"
messages = create_solver_messages(question)
# Returns:
# [
# {"role": "system", "content": SOLVER_SYSTEM_PROMPT},
# {"role": "user", "content": "### Task: Solve Problem\nProblem: {question}\nSolution:"}
# ]
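For orientation, the two helpers could look roughly like the sketch below. It is reconstructed only from the return values shown above and is not the actual contents of src/config/prompts.py. The two system prompt strings are placeholders, since the real prompt text is not reproduced here.

```python
from typing import Dict, List

# Placeholder text (assumption): the real prompts live in src/config/prompts.py.
GENERATOR_SYSTEM_PROMPT = "PLACEHOLDER: system prompt for question generation"
SOLVER_SYSTEM_PROMPT = "PLACEHOLDER: system prompt for solution generation"


def create_generator_messages(instruction: str) -> List[Dict[str, str]]:
    """Build the system/user message pair for question generation."""
    return [
        {"role": "system", "content": GENERATOR_SYSTEM_PROMPT},
        {"role": "user", "content": f"### Task: Generate Question\n{instruction}"},
    ]


def create_solver_messages(question: str) -> List[Dict[str, str]]:
    """Build the system/user message pair for solution generation."""
    return [
        {"role": "system", "content": SOLVER_SYSTEM_PROMPT},
        {
            "role": "user",
            "content": f"### Task: Solve Problem\nProblem: {question}\nSolution:",
        },
    ]
```

Because every caller goes through these two functions, a change to the task header or system prompt reaches training and inference at the same time.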
Files Using Centralized Prompts
- scripts/run_grpo_training.py: GRPO question generation
- scripts/dual_task_sft_pipeline.py: SFT training
- scripts/create_dual_task_dataset.py: Dataset creation
- src/rl/math_environment.py: PPO environment
- src/rl/triple_verifier.py: Consensus verification
Prompt Design Principles
Question Generation
- No explicit step constraints: Let the model decide complexity naturally
- Focus on realistic scenarios and simple operations (grade-school level)
- Output only the problem statement, no solutions
Solution Generation
- Step-by-step format: Each step on its own line starting with "Step N:"
- Final Answer format: Line starting with "Final Answer:"
- Python/SymPy syntax: All math expressions verifiable programmatically
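The solution format above is regular enough to check with a small parser. The sketch below is illustrative only: `parse_solution` and both regular expressions are hypothetical names, not part of the repository, and it checks only the line format, not the math. It assumes the last "Final Answer:" line wins if several appear.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical patterns for the two line types described above.
STEP_RE = re.compile(r"^Step (\d+):\s*(.+)$")
FINAL_RE = re.compile(r"^Final Answer:\s*(.+)$")


def parse_solution(text: str) -> Tuple[List[str], Optional[str]]:
    """Return (step bodies, final answer); final answer is None if missing."""
    steps: List[str] = []
    final_answer: Optional[str] = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines between steps
        step = STEP_RE.match(line)
        if step:
            steps.append(step.group(2).strip())
            continue
        final = FINAL_RE.match(line)
        if final:
            # Assumption: a later "Final Answer:" line overrides an earlier one.
            final_answer = final.group(1).strip()
    return steps, final_answer


solution = "Step 1: 5 - 2 = 3\nFinal Answer: 3"
steps, answer = parse_solution(solution)
```

The extracted step bodies and final answer could then be passed to a SymPy-based checker, which is where the Python/SymPy syntax requirement would matter.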