Apply for a GPU community grant: Academic project
GPU Grant Request: LLM Calibration Benchmark
Who I am: I'm Linwei Tao, a PhD student at the University of Sydney researching calibration and uncertainty estimation in Large Vision-Language Models (LVLMs).
What this Space does: This Space hosts an interactive calibration benchmark for LVLMs. It lets researchers evaluate and compare the confidence calibration of models such as LLaVA, InstructBLIP, and Qwen-VL on standard multimodal QA benchmarks (VQAv2, POPE, MMBench). For each model it reports Expected Calibration Error (ECE) and plots reliability diagrams and confidence histograms.
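For readers unfamiliar with the headline metric, here is a minimal sketch of the standard equal-width-bin ECE computation. It is illustrative only and is not the Space's actual code: the function name `expected_calibration_error`, the 10-bin default, and the input format (one confidence in [0, 1] and one correctness flag per answer) are all assumptions.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: sum over bins of (bin weight) * |accuracy - mean confidence|.

    confidences: per-answer confidence scores in [0, 1] (assumed input format).
    correct:     per-answer 0/1 flags, 1 if the model's answer was right.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)

    # Interior bin edges, e.g. 0.1, 0.2, ..., 0.9 for 10 bins.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Right-inclusive bins (lo, hi]; a confidence of exactly 0.0 lands in bin 0.
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        accuracy = correct[mask].mean()
        mean_confidence = confidences[mask].mean()
        # mask.mean() is the fraction of all samples that fall in this bin.
        ece += mask.mean() * abs(accuracy - mean_confidence)
    return float(ece)
```

The per-bin `accuracy` and `mean_confidence` pairs computed inside the loop are the same quantities a reliability diagram plots, so one pass over the predictions can feed both outputs.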
Why I need GPU compute: Running calibration evaluation on large vision-language models (7B-13B parameters) requires significant GPU memory. The current CPU-only setup is too slow for interactive use: inference on a single model over a benchmark subset takes hours on CPU but would take minutes on a T4/L4 GPU. A GPU grant would make this tool genuinely usable for the research community.
Impact: This is an open-science project. All code is public, and the benchmark is designed to help other researchers quickly audit the calibration of their own models. I plan to publish the methodology at a top-tier venue (NeurIPS/CVPR) and will acknowledge Hugging Face in the paper and Space README.
Institution: University of Sydney, School of Computer Science
Contact: linwei.tao@sydney.edu.au
Personal site: https://www.taolinwei.com