BigCodeBench Leaderboard: Explore code-generation model leaderboards and task details
LLM Performance Leaderboard: View the latest LLM performance leaderboard online
Nexus Function Calling Leaderboard: Display benchmark results for models on various tasks