feat: integrate Trackio for experiment tracking and add Modal training infrastructure with environment and test utilities. 4e663d8 Humanlearning commited on 16 days ago
feat: implement core RL training infrastructure and architecture documentation f3080d1 Humanlearning commited on 17 days ago