---
license: mit
task_categories:
- visual-question-answering
language:
- en
tags:
- hallucination-detection
- object-hallucination
- pope
- coco
- benchmark
size_categories:
- 1K<n<10K
---
# RePOPE: Revisiting Partial Object Hallucination Evaluation
RePOPE is a re-annotated version of the POPE (Polling-based Object Probing Evaluation) benchmark with corrected ground-truth labels. It evaluates object hallucination in multimodal large language models (MLLMs) by asking yes/no questions about object existence in MSCOCO images.
## Dataset Details
- Original Paper: RePOPE: Revisiting Partial Object Hallucination Evaluation
- Original Repository: https://github.com/YanNeu/RePOPE
- Images: MSCOCO 2014 (subset of 500 images)
## Dataset Structure

Each row contains:

- `image`: the MSCOCO image (struct with `bytes` and `path`)
- `image_id`: COCO image identifier (e.g., `000000310196`)
- `question`: a yes/no question about object presence (e.g., "Is there a snowboard in the image?")
- `answer`: ground-truth label (`yes` or `no`)
- `category`: sampling strategy used to select the queried object (`random`, `popular`, or `adversarial`)
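Once loaded, each example is a plain dictionary with the fields above. A minimal sketch of working with one row (the field values below are illustrative stand-ins, not actual dataset contents):

```python
# Illustrative row mirroring the RePOPE schema described above.
# All values are made up for demonstration; real rows come from
# load_dataset("MM-Hallu/RePOPE").
row = {
    "image": {"bytes": b"...", "path": "000000310196.jpg"},
    "image_id": "000000310196",
    "question": "Is there a snowboard in the image?",
    "answer": "yes",
    "category": "random",
}

# A model's yes/no response is scored against the ground-truth label.
prediction = "yes"
is_correct = prediction == row["answer"]
print(is_correct)  # → True
```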
## Splits
This dataset contains all three POPE sampling categories in a single split:
| Category | Count |
|---|---|
| random | 2,774 |
| popular | 2,727 |
| adversarial | 2,684 |
| Total | 8,185 |
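Since all three categories live in one split, results are typically reported per category by grouping on the `category` field. A sketch of that grouping, using mock rows in place of the loaded dataset:

```python
# Group RePOPE-style rows by sampling category so metrics can be
# reported separately for random / popular / adversarial.
# These rows are mock stand-ins that only mirror the schema.
rows = [
    {"question": "Is there a dog in the image?", "answer": "no", "category": "random"},
    {"question": "Is there a person in the image?", "answer": "yes", "category": "popular"},
    {"question": "Is there a fork in the image?", "answer": "no", "category": "adversarial"},
    {"question": "Is there a car in the image?", "answer": "yes", "category": "random"},
]

by_category = {}
for row in rows:
    by_category.setdefault(row["category"], []).append(row)

for name, subset in sorted(by_category.items()):
    print(name, len(subset))
# adversarial 1
# popular 1
# random 2
```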
## Label Distribution
| Answer | Count |
|---|---|
| yes | 3,539 |
| no | 4,646 |
## How to Use

```python
from datasets import load_dataset

ds = load_dataset("MM-Hallu/RePOPE")
```
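With binary yes/no labels, POPE-style evaluation conventionally reports accuracy plus precision, recall, and F1 with "yes" as the positive class. A minimal sketch over mock labels and predictions (not actual model outputs):

```python
# POPE-style metrics: accuracy plus precision/recall/F1, treating
# "yes" as the positive class. Labels and predictions are mock data.
labels = ["yes", "no", "no", "yes", "no", "yes"]
preds = ["yes", "no", "yes", "yes", "no", "no"]

tp = sum(p == "yes" and l == "yes" for p, l in zip(preds, labels))
fp = sum(p == "yes" and l == "no" for p, l in zip(preds, labels))
fn = sum(p == "no" and l == "yes" for p, l in zip(preds, labels))

accuracy = sum(p == l for p, l in zip(preds, labels)) / len(labels)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(f"acc={accuracy:.2f} p={precision:.2f} r={recall:.2f} f1={f1:.2f}")
# acc=0.67 p=0.67 r=0.67 f1=0.67
```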
## Citation

```bibtex
@misc{neuhaus2024repope,
  title={RePOPE: Revisiting Partial Object Hallucination Evaluation},
  author={Yannik Neuschwander and Selen Yu and Jordy Van Landeghem and Jan Van Loock and Lilian Ngweta and Rukiye Savran Kizildag and Desmond Elliott and Matthew B. Blaschko},
  year={2024},
  eprint={2405.14571},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```