MCQ_eval
This collection gathers all the metrics used for evaluation on the GeoBenchLLM datasets.
Evaluate multiple-choice question predictions
Evaluate coordinate predictions with a simple Gradio UI
Evaluate your model on the New York POI dataset
Evaluate keyword extraction with precision, recall, F1 scores
Evaluate regression model predictions with key metrics
Evaluate path planning results with performance metrics
Evaluate your generated place images
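As an illustration of the precision, recall, and F1 scores mentioned for keyword extraction, here is a minimal sketch. It assumes exact matching on case-normalized keyword sets; the function name `keyword_prf` and the sample keywords are hypothetical, and the metric in the collection may normalize or match keywords differently.

```python
from typing import Iterable, Tuple


def keyword_prf(predicted: Iterable[str], reference: Iterable[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall, and F1 for one example.

    Assumption: keywords match only when equal after stripping
    whitespace and lowercasing. Duplicates count once.
    """
    pred = {k.strip().lower() for k in predicted if k.strip()}
    ref = {k.strip().lower() for k in reference if k.strip()}

    true_positives = len(pred & ref)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall) > 0
        else 0.0
    )
    return precision, recall, f1


# Hypothetical data: 2 of 3 predicted keywords appear among 4 reference keywords.
p, r, f1 = keyword_prf(
    ["Park", "museum", "bridge"],
    ["park", "museum", "harbor", "tower"],
)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Using sets keeps the sketch short, but it ignores keyword order and repeated keywords; a stricter or fuzzier matching rule would change the scores.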