Papers-LLMEval
Latxa: An Open Language Model and Evaluation Suite for Basque
Paper
• 2403.20266
• Published • 4
TrustLLM: Trustworthiness in Large Language Models
Paper
• 2401.05561
• Published • 69
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Paper
• 2405.01535
• Published • 124
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory
Paper
• 2405.08707
• Published • 34
tinyBenchmarks: evaluating LLMs with fewer examples
Paper
• 2402.14992
• Published • 17
meta-llama/Llama-3.3-70B-Instruct-evals
Viewer
• Updated • 41.3k • 155
• 44
RUC-NLPIR/OmniEval-HallucinationEvaluator
Text Generation
• Updated • 1
KRLabsOrg/lettucedect-base-modernbert-en-v1
Token Classification
• 0.1B • Updated • 5.15k
• 17