tau train hard no sys mix (user gpt4.1)
retail
Average Reward: 0.6096
Pass^k Metrics:
  k=1: 0.610
  k=2: 0.509
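
For reference, a minimal sketch of how Pass^k values like these are commonly computed, assuming the run follows the tau-bench definition: for each task with c successes out of n trials, take C(c, k) / C(n, k), then average over tasks. The function name `pass_hat_k` and the per-task success counts below are hypothetical illustrations, not data from this run, and the number of trials per task is not recorded in this log.

```python
from math import comb


def pass_hat_k(successes_per_task, num_trials, k):
    """Estimate pass^k: the chance that k trials of a task all succeed.

    successes_per_task: successful trial counts, one entry per task (hypothetical input).
    num_trials: trials run per task (assumed equal for every task).
    """
    if not 1 <= k <= num_trials:
        raise ValueError("k must be between 1 and num_trials")
    total = comb(num_trials, k)
    # comb(c, k) is 0 when c < k, so tasks with too few successes contribute 0.
    per_task = [comb(c, k) / total for c in successes_per_task]
    return sum(per_task) / len(per_task)


# Hypothetical example: 3 tasks, 4 trials each, with 4, 2, and 0 successes.
example_counts = [4, 2, 0]
for k in (1, 2):
    print(f"k={k}: {pass_hat_k(example_counts, 4, k):.3f}")
```

Under this definition k=1 reduces to the plain success rate, which is consistent with the k=1 value above matching the average reward after rounding.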
-