---
title: EgoMemReason Leaderboard
emoji: 🧠
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 5.15.0
python_version: "3.12"
app_file: app.py
pinned: false
license: cc-by-nc-4.0
hf_oauth: true
---

# EgoMemReason: Leaderboard Space

Live leaderboard for the **EgoMemReason** benchmark: 500 multiple-choice questions over week-long egocentric video, evaluating entity / event / behavior memory.

- 🌐 Project page: <https://egomemreason.github.io/>
- 📄 Paper: <https://arxiv.org/abs/2605.09874>
- 💻 Reference eval scripts: <https://github.com/Ziyang412/EgoMemReason>
- 📦 Public questions: <https://huggingface.co/datasets/Ted412/EgoMemReason>
- 🎬 Source frames: <https://egolife-ai.github.io/>

## Operator notes

This Space lives at `Ted412/EgoMemReason` and writes one JSON record per submission to the public dataset `Ted412/EgoMemReason-Leaderboard`. The held-out answer key lives in a separate **private** dataset `Ted412/EgoMemReason-Private` and is pulled at boot via `snapshot_download(token=HF_TOKEN)`.
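
A minimal sketch of the boot-time fetch described above. It assumes the key file is named `annotations_private.json` (as in the architecture listing below) and that a copy already present in the working directory takes precedence over the download. The helper name `load_answer_key` and the local-first check are illustrative assumptions, not the Space's actual `ledger.py`.

```python
import json
import os
from pathlib import Path

PRIVATE_REPO = "Ted412/EgoMemReason-Private"
KEY_FILENAME = "annotations_private.json"


def load_answer_key(workdir="."):
    """Return the parsed answer key, preferring a local copy (hypothetical helper)."""
    local = Path(workdir) / KEY_FILENAME
    if not local.exists():
        # Imported lazily so a run with a locally copied key needs no Hub access.
        from huggingface_hub import snapshot_download

        snapshot_dir = snapshot_download(
            repo_id=PRIVATE_REPO,
            repo_type="dataset",
            token=os.environ["HF_TOKEN"],  # the Space secret described below
            allow_patterns=[KEY_FILENAME],
        )
        local = Path(snapshot_dir) / KEY_FILENAME
    with open(local, encoding="utf-8") as fh:
        return json.load(fh)
```

Reading `HF_TOKEN` with `os.environ[...]` rather than `.get(...)` makes a missing secret fail loudly at boot instead of surfacing later as an authorization error.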

### Required Space secret

| Name | Value | Scope |
|---|---|---|
| `HF_TOKEN` | Fine-grained HF token | Write on `Ted412/EgoMemReason-Leaderboard` + Read on `Ted412/EgoMemReason-Private` |

Create at <https://huggingface.co/settings/tokens> → fine-grained → grant only those two repos.

### Local development

```bash
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# Copy the private answer key into cwd (skips the snapshot_download path).
cp ../EgoMemReason-EvalAI.archived/annotations/annotations_private.json .

# Run, optionally faking a user.
DEBUG_USER=alice python app.py
# → http://127.0.0.1:7860
```
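
One way the `DEBUG_USER` override could be wired in, sketched under two assumptions: the OAuth profile exposes a `username` attribute (as `gr.OAuthProfile` does), and the environment variable is consulted only when nobody is signed in. The function name `resolve_username` is hypothetical and may not match what `auth.py` actually does.

```python
import os


def resolve_username(profile, environ=os.environ):
    """Return the signed-in HF username, the DEBUG_USER fallback, or None.

    `profile` is expected to look like gr.OAuthProfile (has `.username`);
    it is None when nobody is logged in.
    """
    if profile is not None:
        return profile.username
    # Assumption: the override applies only when no OAuth profile is present.
    return environ.get("DEBUG_USER") or None
```

Passing `environ` as a parameter keeps the function easy to test without touching the real process environment.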

Tests:

```bash
python -m pytest tests/ -q
```

### Architecture

```
EgoMemReason-Space (this Space, public)
├── app.py            Gradio UI (Leaderboard / Submit / Manage / About)
├── evaluator.py      pure scoring; port of the old EvalAI main.py
├── ledger.py         HF I/O: pulls private annotations at boot; writes
│                     one JSON record per submission to the public dataset
├── auth.py           resolves the HF username from gr.OAuthProfile
└── annotations_private.json   pulled at boot from the private dataset

Ted412/EgoMemReason-Private (HF dataset, private)
└── annotations_private.json   500 Qs WITH correct_answer

Ted412/EgoMemReason-Leaderboard (HF dataset, public)
└── submissions/
    └── <uuid>.json   one immutable record per submission
                      (only is_selected flips on a re-write)
```
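
To make the data flow concrete, here is a rough sketch of scoring a submission and building its record. Only `correct_answer`, `is_selected`, and the `submissions/<uuid>.json` layout come from the notes above. The `id` field on each annotation, the prediction format, and every other record field are assumptions made for illustration.

```python
import json
import uuid
from datetime import datetime, timezone


def accuracy(predictions, answer_key):
    """Fraction of answer-key questions answered correctly.

    predictions: {question_id: chosen_option}
    answer_key: list of dicts with "id" (assumed name) and "correct_answer".
    Questions missing from `predictions` count as wrong.
    """
    if not answer_key:
        raise ValueError("answer key is empty")
    correct = sum(
        1 for q in answer_key if predictions.get(q["id"]) == q["correct_answer"]
    )
    return correct / len(answer_key)


def build_record(username, predictions, answer_key):
    """Return (path_in_repo, json_text) for one new submission record."""
    submission_id = str(uuid.uuid4())
    record = {
        "id": submission_id,
        "username": username,
        "accuracy": accuracy(predictions, answer_key),
        "submitted_at": datetime.now(timezone.utc).isoformat(),
        "is_selected": False,  # the only field expected to change later
    }
    return f"submissions/{submission_id}.json", json.dumps(record, indent=2)
```

The returned path and text could then be pushed to the public dataset, for example with `huggingface_hub.HfApi.upload_file` and `repo_type="dataset"`; that step is left out here so the sketch runs without network access.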