Ziyang Wang

metadata
title: EgoMemReason Leaderboard
emoji: 🧠
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 5.15.0
python_version: '3.12'
app_file: app.py
pinned: false
license: cc-by-nc-4.0
hf_oauth: true

EgoMemReason - Leaderboard Space

Live leaderboard for the EgoMemReason benchmark: 500 multiple-choice questions over week-long egocentric video, evaluating entity / event / behavior memory.
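As a rough illustration of how a multiple-choice submission could be scored per memory type, here is a minimal sketch. It is not the code in evaluator.py: the `question_id` and `category` field names and the prediction format are assumptions made for the example; only `correct_answer` appears in this README.

```python
from collections import defaultdict


def score(predictions, annotations):
    """Return overall and per-category accuracy.

    predictions: {question_id: chosen option} (hypothetical format)
    annotations: list of dicts with "question_id", "category" and
                 "correct_answer" (the first two names are assumed)
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in annotations:
        category = item["category"]
        total[category] += 1
        # A missing prediction simply counts as a wrong answer.
        if predictions.get(item["question_id"]) == item["correct_answer"]:
            correct[category] += 1
    per_category = {c: correct[c] / total[c] for c in total}
    overall = sum(correct.values()) / sum(total.values()) if total else 0.0
    return {"overall": overall, "per_category": per_category}
```

Counting unanswered questions as wrong keeps the denominator fixed at the full question set, so partial submissions cannot inflate their accuracy.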

Operator notes

This Space lives at Ted412/EgoMemReason and writes one JSON record per submission to the public dataset Ted412/EgoMemReason-Leaderboard. The held-out answer key lives in a separate private dataset Ted412/EgoMemReason-Private and is pulled at boot via snapshot_download(token=HF_TOKEN).
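The boot-time pull described above might look roughly like the sketch below. It assumes the answer key sits at the root of the private dataset as annotations_private.json and that a copy already present in the working directory takes precedence; the function name `locate_answer_key` and the injectable `download` parameter are hypothetical and exist only to make the sketch testable offline. The actual ledger.py may be organized differently.

```python
import os
from pathlib import Path

PRIVATE_REPO = "Ted412/EgoMemReason-Private"
ANSWER_KEY = "annotations_private.json"


def locate_answer_key(workdir=".", download=None):
    """Return a path to the answer key, downloading it only if needed."""
    local = Path(workdir) / ANSWER_KEY
    if local.exists():
        # A local copy short-circuits the Hub download entirely.
        return local
    if download is None:
        # Imported lazily so the module loads without huggingface_hub.
        from huggingface_hub import snapshot_download
        download = snapshot_download
    snapshot_dir = download(
        repo_id=PRIVATE_REPO,
        repo_type="dataset",                # the key lives in a dataset repo
        token=os.environ.get("HF_TOKEN"),   # Space secret with read access
        allow_patterns=[ANSWER_KEY],        # fetch only the one file
    )
    return Path(snapshot_dir) / ANSWER_KEY
```

Passing `repo_type="dataset"` matters here: without it, `snapshot_download` looks for a model repo with that name.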

Required Space secret

| Name | Value | Scope |
| --- | --- | --- |
| HF_TOKEN | Fine-grained HF token | Write on Ted412/EgoMemReason-Leaderboard + Read on Ted412/EgoMemReason-Private |

Create at https://huggingface.co/settings/tokens → fine-grained → grant only those two repos.

Local development

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# Copy the private answer key into cwd (skips the snapshot_download path).
cp ../EgoMemReason-EvalAI.archived/annotations/annotations_private.json .

# Run, optionally faking a user.
DEBUG_USER=alice python app.py
# → http://127.0.0.1:7860

Tests:

python -m pytest tests/ -q
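The `DEBUG_USER` override used in the local-development commands could be implemented along these lines. This is a sketch under assumptions, not the contents of auth.py: the function name `resolve_username` is hypothetical, and the profile is read by duck typing through its `username` attribute so the example runs without Gradio installed.

```python
import os


def resolve_username(profile=None):
    """Return the acting username, or None if nobody is logged in.

    profile: a gr.OAuthProfile-like object exposing `.username`,
             or None when the visitor has not signed in.
    """
    # Local development: DEBUG_USER fakes a logged-in user.
    debug_user = os.environ.get("DEBUG_USER")
    if debug_user:
        return debug_user
    if profile is None:
        return None
    return getattr(profile, "username", None)
```

With `DEBUG_USER=alice` set, the function returns `"alice"` regardless of the OAuth state, so an override like this should never be set on the deployed Space.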

Architecture

EgoMemReason-Space (this Space, public)
├── app.py            Gradio UI (Leaderboard / Submit / Manage / About)
├── evaluator.py      pure scoring, a port of the old EvalAI main.py
├── ledger.py         HF I/O: pulls private annotations at boot; writes
│                     one JSON record per submission to the public dataset
├── auth.py           resolves the HF username from gr.OAuthProfile
└── annotations_private.json   pulled at boot from the private dataset

Ted412/EgoMemReason-Private (HF dataset, private)
└── annotations_private.json   500 Qs WITH correct_answer

Ted412/EgoMemReason-Leaderboard (HF dataset, public)
└── submissions/
    └── <uuid>.json   one immutable record per submission
                      (only is_selected flips on a re-write)
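To make the one-record-per-submission layout concrete, the sketch below builds a record, derives its `submissions/<uuid>.json` path, and flips `is_selected` without touching anything else. Every field other than `is_selected` (`id`, `username`, `scores`, `submitted_at`) and all function names are assumptions for illustration; the real schema in ledger.py is not shown in this README.

```python
import json
import uuid
from datetime import datetime, timezone


def new_record(username, scores):
    """Build a fresh submission record (hypothetical schema)."""
    return {
        "id": str(uuid.uuid4()),
        "username": username,
        "scores": scores,
        "submitted_at": datetime.now(timezone.utc).isoformat(),
        "is_selected": False,
    }


def record_path(record):
    """Path of the record inside the leaderboard dataset."""
    return f"submissions/{record['id']}.json"


def with_selection(record, selected):
    """Return a copy where only is_selected differs from the input."""
    return {**record, "is_selected": bool(selected)}


def serialize(record):
    """Encode the record as UTF-8 JSON bytes, ready for upload."""
    return json.dumps(record, indent=2, sort_keys=True).encode("utf-8")
```

The bytes from `serialize` could then be pushed with `huggingface_hub.HfApi().upload_file(path_or_fileobj=..., path_in_repo=record_path(record), repo_id="Ted412/EgoMemReason-Leaderboard", repo_type="dataset", token=...)`. Re-uploading to the same path after `with_selection` matches the "only is_selected flips" rule noted in the tree above.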