🧑‍⚖️ LLM-as-a-judge - a m-ric Collection

updated Nov 21, 2024

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Paper • 2306.05685 • Published Jun 9, 2023 • 42

ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent

Paper • 2312.10003 • Published Dec 15, 2023 • 44

Leveraging Large Language Models for NLG Evaluation: A Survey

Paper • 2401.07103 • Published Jan 13, 2024 • 4

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Paper • 2310.08491 • Published Oct 12, 2023 • 57

Judge Arena

View and compare open-source AI model rankings with Elo scores