EgoMemReason / SUBMISSION_FORMAT.md
Ziyang Wang
update GitHub URL to Ziyang412/EgoMemReason
9cf02ac

Submission Format

A submission is a single JSON file (.json) containing a top-level array of 500 prediction objects — one per question.

Schema

[
  {"example_id": 1,   "predicted_answer": "A"},
  {"example_id": 2,   "predicted_answer": "C"},
  {"example_id": 500, "predicted_answer": "B"}
]

Required keys (per object):

  • example_id — integer in [1, 500], matching example_id in annotations_public.json.
  • predicted_answer — single uppercase letter that appears in that question's options dict.

Important: questions have between 4 and 10 options. The valid answer letters for any given question are exactly the keys of its options dict. Most are A-F; Event Ordering questions can extend to A-J. A letter outside the question's option set is rejected.

Optional keys (ignored, but useful for your own debugging): raw_response, confidence, tokens, etc.
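The per-question letter check can be sketched as below. This is only an illustration, not the leaderboard's own code: the helper name is_valid_answer is hypothetical, and the two options dicts are made-up stand-ins for a 4-option and a 10-option question.

```python
def is_valid_answer(letter: str, options: dict) -> bool:
    """Return True if `letter` is one of the keys of this question's options dict."""
    return letter in options


# Made-up option sets, for illustration only.
four_options = {"A": "first", "B": "second", "C": "third", "D": "fourth"}
ten_options = {chr(ord("A") + i): f"option {i + 1}" for i in range(10)}  # keys A..J

print(is_valid_answer("D", four_options))  # True
print(is_valid_answer("G", four_options))  # False: G is not a key of this question
print(is_valid_answer("J", ten_options))   # True
```

The point of checking against the keys rather than a fixed A-F range is that a letter valid for one question can be invalid for another.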

Rules

  1. Top-level must be a JSON array (not an object).
  2. The submission must cover exactly 500 unique example_ids, one per question.
  3. Duplicate example_ids are rejected.
  4. Letters must be uppercase (whitespace is trimmed).
  5. File extension must be .json.
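Before uploading, the rules above can be checked locally with something like the following sketch. It is not the server-side validator. The function names check_predictions and check_file are hypothetical, and the code assumes that annotations_public.json loads as a list of dicts, each with an example_id and an options dict.

```python
import json
from pathlib import Path


def check_predictions(data, annotations, expected_count=500):
    """Return a list of problems found in parsed submission data (empty if none)."""
    if not isinstance(data, list):
        return ["top-level value must be a JSON array"]
    # Assumption: each annotation record has `example_id` and `options` keys.
    options_by_id = {q["example_id"]: q["options"] for q in annotations}
    problems, seen = [], set()
    for i, rec in enumerate(data):
        if not isinstance(rec, dict):
            problems.append(f"item {i}: not an object")
            continue
        ex_id = rec.get("example_id")
        answer = rec.get("predicted_answer")
        if not isinstance(ex_id, int) or ex_id not in options_by_id:
            problems.append(f"item {i}: unknown example_id {ex_id!r}")
            continue
        if ex_id in seen:
            problems.append(f"item {i}: duplicate example_id {ex_id}")
            continue
        seen.add(ex_id)
        # Whitespace is trimmed; a lowercase letter fails because option keys are uppercase.
        letter = answer.strip() if isinstance(answer, str) else None
        if letter not in options_by_id[ex_id]:
            problems.append(f"item {i}: answer {answer!r} is not an option of question {ex_id}")
    if len(seen) != expected_count:
        problems.append(f"covers {len(seen)} unique example_ids, expected {expected_count}")
    return problems


def check_file(path, annotations, expected_count=500):
    """Check the file extension, then load the file and check its contents."""
    if Path(path).suffix != ".json":
        return ["file extension must be .json"]
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return check_predictions(data, annotations, expected_count)
```

A typical call would be check_file("submission.json", annotations), where annotations is the parsed content of annotations_public.json; an empty list means none of the checks above failed.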

Converting from existing eval-script output

The reference inference scripts in the EgoMemReason GitHub repo write a list of records with a pred field. A short script converts that output into the submission format:

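import json

# Read the eval-script output: a list of records with example_id and pred.
with open("results_my_model.json") as f:
    src = json.load(f)
sub = [{"example_id": r["example_id"], "predicted_answer": r["pred"]} for r in src]
with open("submission.json", "w") as f:
    json.dump(sub, f)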

How submissions are scored

Submissions are scored by accuracy (%) on each of the six query_type splits:

  • Cumulative State Tracking (100 Qs)
  • Temporal Counting (100 Qs)
  • Event Ordering (100 Qs)
  • Event Linking (100 Qs)
  • Spatial Preference (50 Qs)
  • Activity Pattern (50 Qs)

plus Overall accuracy on all 500. All seven values appear on the leaderboard; ranking is by Overall descending.
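The seven reported numbers can be reproduced on any set of questions with known answers using a sketch like the one below. It is an illustration rather than the official scorer: the function name score and the shape of gold (a mapping from example_id to a query_type and correct letter pair) are assumptions, and the four sample questions are made up.

```python
from collections import defaultdict


def score(predictions, gold):
    """Accuracy in percent per query_type, plus "Overall" across all questions.

    predictions: {example_id: predicted letter}
    gold: {example_id: (query_type, correct letter)}  # assumed layout
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex_id, (query_type, answer) in gold.items():
        hit = predictions.get(ex_id) == answer  # a missing prediction counts as wrong
        for key in (query_type, "Overall"):
            total[key] += 1
            correct[key] += hit
    return {key: 100.0 * correct[key] / total[key] for key in total}


# Made-up data for illustration.
gold = {
    1: ("Temporal Counting", "A"),
    2: ("Temporal Counting", "B"),
    3: ("Event Ordering", "C"),
    4: ("Event Ordering", "D"),
}
preds = {1: "A", 2: "C", 3: "C", 4: "D"}
result = score(preds, gold)
print(result["Temporal Counting"])  # 50.0
print(result["Event Ordering"])     # 100.0
print(result["Overall"])            # 75.0
```

Because the splits differ in size (four of 100 questions, two of 50), Overall computed over all 500 questions is generally not the unweighted mean of the six split accuracies.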

Submission limits

  • 5 submissions per HF user per 24-hour window.
  • The 24-hour window is rolling, not midnight-aligned.
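The difference between a rolling and a midnight-aligned window can be illustrated with a small sketch. It is not the leaderboard's actual rate limiter: the name can_submit is hypothetical, and treating a submission made exactly 24 hours ago as expired is an assumption.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(hours=24)
MAX_SUBMISSIONS = 5


def can_submit(previous, now):
    """Return True if fewer than MAX_SUBMISSIONS of `previous` fall within the last 24 hours."""
    # Assumption: a submission exactly 24 hours old no longer counts.
    recent = [t for t in previous if now - t < WINDOW]
    return len(recent) < MAX_SUBMISSIONS


# Five submissions on the same day, one per hour from 09:00 to 13:00 UTC.
day = datetime(2024, 1, 1, tzinfo=timezone.utc)
previous = [day + timedelta(hours=h) for h in range(9, 14)]

print(can_submit(previous, day + timedelta(hours=23, minutes=59)))  # False
print(can_submit(previous, day + timedelta(hours=24, minutes=30)))  # False: midnight does not reset the count
print(can_submit(previous, day + timedelta(hours=33, minutes=30)))  # True: the 09:00 submission has aged out
```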

Selected submission

Submit as many times as you like under the cap. In the Manage my submissions tab you can mark one of your past submissions as your selected entry. The default leaderboard view shows only each team's selected entry; the "Show all submissions" toggle reveals all.
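The two leaderboard views can be pictured with the following sketch. It is purely illustrative: the function leaderboard_rows and the record keys selected and overall are hypothetical, the sample rows are made up, and how teams without a selected entry are handled is not specified here.

```python
def leaderboard_rows(submissions, show_all=False):
    """Return rows for display, ranked by Overall accuracy in descending order."""
    rows = submissions if show_all else [s for s in submissions if s["selected"]]
    return sorted(rows, key=lambda s: s["overall"], reverse=True)


# Made-up submissions for illustration.
submissions = [
    {"team_name": "team-a", "method_name": "v1", "overall": 41.2, "selected": False},
    {"team_name": "team-a", "method_name": "v2", "overall": 44.0, "selected": True},
    {"team_name": "team-b", "method_name": "base", "overall": 47.6, "selected": True},
]

print([s["method_name"] for s in leaderboard_rows(submissions)])                 # ['base', 'v2']
print([s["method_name"] for s in leaderboard_rows(submissions, show_all=True)])  # ['base', 'v2', 'v1']
```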

Required metadata fields

When you submit you must fill in:

| Field | Required | Notes |
| --- | --- | --- |
| team_name | yes | Team or affiliation |
| method_name | yes | Short title displayed on the leaderboard |
| uses_external_data | yes | (yes/no) Did you train / finetune on anything beyond EgoLife? |
| uses_video_frames | yes | one of frames-only · video-only · frames+audio · captions-only · other |
| model_size | no | e.g. 8B, 32B, API |
| method_description | no | Free-form description |
| project_url | no | Project page |
| publication_url | no | arXiv / OpenReview link |
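If you prepare the metadata programmatically, a pre-flight check along these lines may help. It is a sketch under assumptions: the function name check_metadata is hypothetical, and representing uses_external_data as the strings "yes" and "no" is a guess at how the form encodes that field.

```python
REQUIRED = ("team_name", "method_name", "uses_external_data", "uses_video_frames")
INPUT_MODES = {"frames-only", "video-only", "frames+audio", "captions-only", "other"}


def check_metadata(meta):
    """Return a list of problems with the metadata dict (empty if none were found)."""
    problems = [f"missing required field: {k}" for k in REQUIRED if meta.get(k) in (None, "")]
    # Assumption: the yes/no field is carried as the string "yes" or "no".
    if meta.get("uses_external_data") not in (None, "", "yes", "no"):
        problems.append("uses_external_data must be yes or no")
    mode = meta.get("uses_video_frames")
    if mode not in (None, "") and mode not in INPUT_MODES:
        problems.append(f"uses_video_frames must be one of {sorted(INPUT_MODES)}")
    return problems


# Made-up metadata for illustration; the optional fields are simply omitted.
meta = {
    "team_name": "example-team",
    "method_name": "example-method",
    "uses_external_data": "no",
    "uses_video_frames": "frames-only",
}
print(check_metadata(meta))  # []
```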