tasks_tough: add 42 more tough scenarios + baseline profiler fe54c01 Don Rishabh Claude Opus 4.7 (1M context) committed 15 days ago
tasks_tough: add 10 domain-classifier tough scenarios (seed batch) 25d9413 Don Rishabh Claude Opus 4.7 (1M context) committed 15 days ago