ml-intern / eval

Commit History

fix: properly close SDK message on error, show tool errorText
c68afb6

akseljoonas committed on

Revert "fix: show errorText for failed tools, bump eval max_iterations to 300"
dd8076b

akseljoonas committed on

fix: show errorText for failed tools, bump eval max_iterations to 300
53a006d

akseljoonas committed on

remove lmnr dependency
1de37c8

akseljoonas committed on

functioning frontend and docker
32f62c3

akseljoonas committed on

generated, filled in and verified 250 eval questions
8541221

akseljoonas committed on

intermediate commit until i let amp loose
158d846

akseljoonas committed on

eval readme update
8f4b322

akseljoonas committed on

gpt 5 nano judge
9fe493b

akseljoonas committed on

fixing tracing
9de209d

akseljoonas committed on

adding claude code + mcp
00d49da

akseljoonas committed on

leaderboard and results
be350cb

akseljoonas committed on

updated eval
035d186

akseljoonas committed on

dataset creation script
f92b0c4

akseljoonas committed on

adding readme
f2e6e35

akseljoonas committed on

adding hf datasets i/o
3da9761

akseljoonas committed on

removing jsonl csv
78e49c7

akseljoonas committed on

eval script done
af80aa7

akseljoonas committed on

modified eval prompt
124a8a4

akseljoonas committed on

thinking if we want eval or not
2b6a536

akseljoonas committed on