Fix #3: Add LLM-as-a-Judge (PASS/FAIL) + BERTScore evaluation, the two hackathon-required accuracy metrics (f486777, verified; muthuk1 committed 8 days ago)
Add Layer 4: Evaluation Layer (RAGAS + custom F1/EM metrics + benchmarking) (0117576, verified; muthuk1 committed 12 days ago)