ml-intern / eval

Commit History

feat: CLI local mode, slash commands, interrupt support; remove lmnr; frontend fixes
82b0c13

akseljoonas HF Staff Claude Opus 4.6 committed on

fix: properly close SDK message on error, show tool errorText
b22c5f3

akseljoonas HF Staff committed on

Revert "fix: show errorText for failed tools, bump eval max_iterations to 300"
f765eb4

akseljoonas HF Staff committed on

fix: show errorText for failed tools, bump eval max_iterations to 300
b2846f6

akseljoonas HF Staff committed on

functioning frontend and docker
ba93c86

akseljoonas HF Staff committed on

generated, filled in and verified 250 eval questions
7534b92

akseljoonas HF Staff committed on

intermediate commit until i let amp loose
8bd1c22

akseljoonas HF Staff committed on

eval readme update
a9d2c33

akseljoonas HF Staff committed on

gpt 5 nano judge
0aa56ff

akseljoonas HF Staff committed on

fixing tracing
b402135

akseljoonas HF Staff committed on

rename
a3335a4

akseljoonas HF Staff committed on

rename
285c954

akseljoonas HF Staff committed on

adding claude code + mcp
eab219c

akseljoonas HF Staff committed on

leaderboard and results
df3b181

akseljoonas HF Staff committed on

link fix
f00b1a6

akseljoonas HF Staff committed on

updated eval
235ace7

akseljoonas HF Staff committed on

dataset creation script
73d437d

akseljoonas HF Staff committed on

adding readme
dc71e7b

akseljoonas HF Staff committed on

adding hf datasets i/o
c1fac32

akseljoonas HF Staff committed on

eval script done
bc84cfe

akseljoonas HF Staff committed on

eval runs
202a610

akseljoonas HF Staff committed on

modified eval prompt
f050c81

akseljoonas HF Staff committed on

thinking if we want eval or not
522a08c

akseljoonas HF Staff committed on