EgoMemReason / README.md

Ziyang Wang

add arXiv paper

1b38b03 about 23 hours ago


	---
	title: EgoMemReason Leaderboard
	emoji: 🧠
	colorFrom: indigo
	colorTo: purple
	sdk: gradio
	sdk_version: 5.15.0
	python_version: "3.12"
	app_file: app.py
	pinned: false
	license: cc-by-nc-4.0
	hf_oauth: true
	---

	# EgoMemReason — Leaderboard Space

	Live leaderboard for the EgoMemReason benchmark: 500 multiple-choice questions over week-long egocentric video, evaluating entity / event / behavior memory.

	- 🌐 Project page: <https://egomemreason.github.io/>
	- 📄 Paper: <https://arxiv.org/abs/2605.09874>
	- 💻 Reference eval scripts: <https://github.com/Ziyang412/EgoMemReason>
	- 📦 Public questions: <https://huggingface.co/datasets/Ted412/EgoMemReason>
	- 🎬 Source frames: <https://egolife-ai.github.io/>

	## Operator notes

	This Space lives at `Ted412/EgoMemReason` and writes one JSON record per submission to the public dataset `Ted412/EgoMemReason-Leaderboard`. The held-out answer key lives in a separate private dataset `Ted412/EgoMemReason-Private` and is pulled at boot via `snapshot_download(token=HF_TOKEN)`.
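The write path described above can be sketched roughly as follows. This is a minimal illustration, not the Space's actual `ledger.py`: the helper names and every record field except `is_selected` are hypothetical, and the `submissions/<uuid>.json` path is taken from the Architecture tree in this README. `HfApi.upload_file` is the real `huggingface_hub` call; it is imported lazily so the pure helpers run without network access.

```python
import json
import uuid
from datetime import datetime, timezone

# Public dataset named in this README.
LEADERBOARD_REPO = "Ted412/EgoMemReason-Leaderboard"


def build_record(username: str, accuracy: float) -> dict:
    """Assemble one submission record. Field names are hypothetical,
    except is_selected, which the README mentions."""
    return {
        "id": str(uuid.uuid4()),
        "username": username,
        "accuracy": accuracy,
        "is_selected": True,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }


def record_path(record: dict) -> str:
    """One file per submission, keyed by the record's UUID."""
    return f"submissions/{record['id']}.json"


def push_record(record: dict, token: str) -> None:
    """Upload the record as a single JSON file to the public dataset."""
    from huggingface_hub import HfApi  # lazy import: only needed for the upload

    HfApi(token=token).upload_file(
        path_or_fileobj=json.dumps(record, indent=2).encode("utf-8"),
        path_in_repo=record_path(record),
        repo_id=LEADERBOARD_REPO,
        repo_type="dataset",
    )
```

Keeping record construction separate from the upload makes the record shape easy to unit-test offline; only `push_record` needs a token with write access.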

	### Required Space secret

	| Name | Value | Scope |
	|---|---|---|
	| `HF_TOKEN` | Fine-grained HF token | Write on `Ted412/EgoMemReason-Leaderboard` + Read on `Ted412/EgoMemReason-Private` |

	Create the token at <https://huggingface.co/settings/tokens>: choose the fine-grained type and grant access to only those two repos.
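Since the Space depends on this secret at boot, it can help to fail early with a clear message when it is missing. The sketch below is an assumption about how that check might look; `require_token` is a hypothetical helper, not a function from this repository.

```python
import os


def require_token(env=os.environ) -> str:
    """Return HF_TOKEN, or raise if the Space secret is missing or blank.

    Hypothetical helper; the real app may handle a missing token differently.
    """
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it as a Space secret.")
    return token
```

Passing the environment as a parameter keeps the check testable without touching the real process environment.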

	### Local development

	```bash
	python -m venv .venv && source .venv/bin/activate
	pip install -r requirements.txt

	# Copy the private answer key into cwd (skips the snapshot_download path).
	cp ../EgoMemReason-EvalAI.archived/annotations/annotations_private.json .

	# Run, optionally faking a user.
	DEBUG_USER=alice python app.py
	# → http://127.0.0.1:7860
	```
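The two local shortcuts above (a local answer key that skips `snapshot_download`, and `DEBUG_USER` for faking a user) could be implemented along these lines. This is a sketch under assumptions: the function names are hypothetical, giving `DEBUG_USER` precedence over the OAuth profile is a guess at the behavior, and the profile is treated as any object with a `username` attribute (as `gr.OAuthProfile` is expected to provide).

```python
import os
from pathlib import Path
from typing import Optional

PRIVATE_REPO = "Ted412/EgoMemReason-Private"  # private dataset named in this README
ANNOTATIONS_FILE = "annotations_private.json"


def find_annotations(cwd: Path, token: Optional[str] = None) -> Path:
    """Prefer a local copy of the answer key; otherwise download the private dataset."""
    local = cwd / ANNOTATIONS_FILE
    if local.exists():
        return local
    from huggingface_hub import snapshot_download  # lazy: only needed without a local copy

    snapshot_dir = snapshot_download(
        repo_id=PRIVATE_REPO, repo_type="dataset", token=token
    )
    return Path(snapshot_dir) / ANNOTATIONS_FILE


def resolve_username(profile, env=os.environ) -> Optional[str]:
    """Return DEBUG_USER if set (assumed precedence), else the profile's username."""
    debug_user = env.get("DEBUG_USER")
    if debug_user:
        return debug_user
    # Assumes the profile object exposes .username; None means not logged in.
    return getattr(profile, "username", None)
```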

	Tests:

	```bash
	python -m pytest tests/ -q
	```

	### Architecture

	```
	EgoMemReason-Space (this Space, public)
	├── app.py                     Gradio UI (Leaderboard / Submit / Manage / About)
	├── evaluator.py               pure scoring, port of the old EvalAI main.py
	├── ledger.py                  HF I/O: pulls private annotations at boot; writes
	│                              one JSON record per submission to the public dataset
	├── auth.py                    resolves the HF username from gr.OAuthProfile
	└── annotations_private.json   pulled at boot from the private dataset

	Ted412/EgoMemReason-Private (HF dataset, private)
	└── annotations_private.json   500 Qs WITH correct_answer

	Ted412/EgoMemReason-Leaderboard (HF dataset, public)
	└── submissions/
	    └── <uuid>.json            one immutable record per submission
	                               (only is_selected flips on a re-write)
	```
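To illustrate what "pure scoring" in `evaluator.py` might mean, here is a minimal sketch of multiple-choice accuracy against the answer key. It is not the actual port of the EvalAI script: the `question_id` field, the shape of `predictions`, and the rule that unanswered questions count as wrong are assumptions; only `correct_answer` comes from this README.

```python
def score(predictions: dict, answer_key: list) -> dict:
    """Compute accuracy over the full answer key.

    predictions: maps a question id (hypothetical field name) to a chosen option.
    answer_key:  list of dicts, each with "question_id" and "correct_answer".
    Missing predictions are scored as incorrect (an assumption).
    """
    total = len(answer_key)
    correct = 0
    for item in answer_key:
        predicted = str(predictions.get(item["question_id"], "")).strip().upper()
        expected = str(item["correct_answer"]).strip().upper()
        if predicted and predicted == expected:
            correct += 1
    accuracy = correct / total if total else 0.0
    return {"correct": correct, "total": total, "accuracy": accuracy}
```

Because the function takes plain dicts and lists and performs no I/O, it can be tested without a token or network access, which matches the split between `evaluator.py` and `ledger.py` in the tree above.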