feat: document GPU-utilization tuning in README, add run name parameter to Modal training script, and update GRPO configuration for trace logging and vLLM settings
feat: improve training image setup and dependency installation, add startup notice for Modal execution, and implement training heartbeat for monitoring