
# Benchmark Results

| Timestamp | Task | Accuracy | Model | Max Problems |
| --- | --- | --- | --- | --- |
| 2026-01-12T02:44:09.419982 | ARC-Easy | 0.00% | checkpoints/final_135m.pt | 5 |
| 2026-01-12T02:44:09.419982 | ARC-Challenge | 0.00% | checkpoints/final_135m.pt | 5 |
| 2026-01-12T02:44:09.419982 | MMLU | 0.00% | checkpoints/final_135m.pt | 5 |
| 2026-01-12T02:44:09.419982 | GSM8K | 0.00% | checkpoints/final_135m.pt | 5 |
| 2026-01-12T02:44:09.419982 | HumanEval | 0.00% | checkpoints/final_135m.pt | 5 |
| 2026-01-12T02:44:09.419982 | SpellingBee | 0.00% | checkpoints/final_135m.pt | 5 |
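
To help read the Accuracy column, here is a minimal sketch of how a percentage like this could be derived when each task is capped at a small number of problems. It assumes accuracy is the number of correct answers divided by the number of problems evaluated, which the results do not state explicitly. The function names `accuracy_percent` and `possible_scores` are hypothetical and are not taken from the benchmark script.

```python
from typing import List


def accuracy_percent(num_correct: int, num_problems: int) -> str:
    """Format correct/total as a percentage string with two decimals.

    Assumption: accuracy = num_correct / num_problems.
    """
    if num_problems <= 0:
        raise ValueError("num_problems must be positive")
    if not 0 <= num_correct <= num_problems:
        raise ValueError("num_correct must be between 0 and num_problems")
    return f"{100.0 * num_correct / num_problems:.2f}%"


def possible_scores(num_problems: int) -> List[str]:
    """List every accuracy value that can occur with num_problems problems."""
    return [accuracy_percent(k, num_problems) for k in range(num_problems + 1)]


if __name__ == "__main__":
    # With Max Problems = 5, a score can only move in steps of 20 points.
    print(possible_scores(5))
    # Under the assumption above, 0.00% corresponds to 0 of 5 correct.
    print(accuracy_percent(0, 5))
```

Under this assumption, a run limited to 5 problems per task can only report 0%, 20%, 40%, 60%, 80%, or 100%, so each 0.00% entry would correspond to zero correct answers out of five.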