
	---
	license: mit
	task_categories:
	- visual-question-answering
	language:
	- en
	tags:
	- hallucination-detection
	- object-hallucination
	- pope
	- coco
	- benchmark
	size_categories:
	- 1K<n<10K
	---

	# RePOPE: Impact of Annotation Errors on the POPE Benchmark

	RePOPE is a re-annotated version of the POPE (Polling-based Object Probing Evaluation) benchmark with corrected ground-truth labels. It evaluates object hallucination in multimodal large language models (MLLMs) by asking yes/no questions about object existence in MSCOCO images.

	## Dataset Details

	- Original Paper: [RePOPE: Impact of Annotation Errors on the POPE Benchmark](https://arxiv.org/abs/2504.15707)
	- Original Repository: [https://github.com/YanNeu/RePOPE](https://github.com/YanNeu/RePOPE)
	- Images: MSCOCO 2014 (subset of 500 images)

	## Dataset Structure

	Each row contains:

	- `image`: The MSCOCO image (struct with `bytes` and `path`)
	- `image_id`: COCO image identifier (e.g., `000000310196`)
	- `question`: A yes/no question about object presence (e.g., "Is there a snowboard in the image?")
	- `answer`: Ground truth label (`yes` or `no`)
	- `category`: Sampling strategy used to select the queried object (`random`, `popular`, or `adversarial`)
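The field list above can be checked programmatically before running an evaluation. The following is a minimal sketch, not part of the original repository: `validate_row` and `queried_object` are hypothetical helper names, the question template is inferred only from the single example shown in this card, and `example_row` is a hand-written stand-in rather than a row copied from the dataset.

```python
import re
from typing import Optional

REQUIRED_FIELDS = ("image", "image_id", "question", "answer", "category")
VALID_ANSWERS = {"yes", "no"}
VALID_CATEGORIES = {"random", "popular", "adversarial"}

# Assumed template, inferred from the example question in this card.
_QUESTION_RE = re.compile(r"^Is there an? (.+) in the image\?$")


def validate_row(row: dict) -> None:
    """Raise ValueError if a row does not match the documented schema."""
    missing = [name for name in REQUIRED_FIELDS if name not in row]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if row["answer"] not in VALID_ANSWERS:
        raise ValueError(f"unexpected answer: {row['answer']!r}")
    if row["category"] not in VALID_CATEGORIES:
        raise ValueError(f"unexpected category: {row['category']!r}")


def queried_object(question: str) -> Optional[str]:
    """Return the object name from the question, or None if the template does not match."""
    match = _QUESTION_RE.match(question.strip())
    return match.group(1) if match else None


# Hand-written stand-in row (label and category are illustrative only).
example_row = {
    "image": {"bytes": b"", "path": None},
    "image_id": "000000310196",
    "question": "Is there a snowboard in the image?",
    "answer": "no",
    "category": "random",
}

validate_row(example_row)
print(queried_object(example_row["question"]))
```

Returning `None` for unmatched questions, rather than raising, makes it easy to count how many rows deviate from the assumed template.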

	### Splits

	This dataset contains all three POPE sampling categories in a single split:

	| Category    | Count |
	|-------------|-------|
	| random      | 2,774 |
	| popular     | 2,727 |
	| adversarial | 2,684 |
	| Total       | 8,185 |
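Because all categories share one split, per-category subsets have to be selected by the `category` column. The sketch below tallies rows per category and compares the result with the counts in the table; `count_by` and `check_category_counts` are hypothetical helper names, and `rows` can be any iterable of dict-like rows.

```python
from collections import Counter
from typing import Iterable, Mapping

# Counts copied from the table in this card.
EXPECTED_CATEGORY_COUNTS = {"random": 2774, "popular": 2727, "adversarial": 2684}


def count_by(rows: Iterable[Mapping], key: str) -> Counter:
    """Tally rows by the value stored under `key`."""
    return Counter(row[key] for row in rows)


def check_category_counts(rows: Iterable[Mapping]) -> bool:
    """Return True if the per-category counts match the documented table."""
    return count_by(rows, "category") == EXPECTED_CATEGORY_COUNTS


def select_category(rows: Iterable[Mapping], category: str) -> list:
    """Keep only the rows sampled with the given strategy."""
    return [row for row in rows if row["category"] == category]
```

With a loaded `datasets` object, the same selection could likely be done with `ds[split].filter(lambda row: row["category"] == "adversarial")`, where the split name is an assumption to verify against the loaded dataset.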

	### Label Distribution

	| Answer | Count |
	|--------|-------|
	| yes    | 3,539 |
	| no     | 4,646 |
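The labels are not balanced: a model that always answers `no` would score 4,646 / 8,185, or about 56.8% accuracy, so accuracy alone can be misleading. The sketch below computes accuracy, precision, recall, F1, and the fraction of `yes` predictions, treating `yes` as the positive class. That convention follows common practice for POPE-style evaluation and is an assumption here, as is the helper name `yes_no_metrics`.

```python
from typing import Dict, Sequence


def yes_no_metrics(preds: Sequence[str], labels: Sequence[str]) -> Dict[str, float]:
    """Binary metrics with 'yes' as the positive class."""
    if len(preds) != len(labels) or not labels:
        raise ValueError("preds and labels must be non-empty and equally long")
    tp = fp = tn = fn = 0
    for pred, label in zip(preds, labels):
        if pred == "yes" and label == "yes":
            tp += 1
        elif pred == "yes":
            fp += 1
        elif label == "yes":
            fn += 1  # any non-'yes' prediction on a 'yes' label counts as a miss
        else:
            tn += 1
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "yes_ratio": (tp + fp) / total,
    }


# Accuracy of the constant 'no' baseline, from the table above.
majority_baseline = 4646 / (3539 + 4646)
```

Reporting `yes_ratio` alongside F1 helps reveal a model that is biased toward one answer.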

	## How to Use

	```python
	from datasets import load_dataset

	ds = load_dataset("MM-Hallu/RePOPE")
	```
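Once the rows are loaded, an evaluation loop typically needs to map free-form model output onto `yes` or `no` before comparing it with `answer`. The following is a rough sketch under stated assumptions: `answer_fn` is a hypothetical callable that wraps your model and returns raw text for a row, and the first-word normalization rule is a simple heuristic, not the protocol of the original paper.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Mapping


def normalize_answer(text: str) -> str:
    """Map free-form output to 'yes', 'no', or 'unknown' using the first word."""
    words = text.strip().lower().replace(",", " ").replace(".", " ").split()
    if words and words[0] in ("yes", "no"):
        return words[0]
    return "unknown"


def accuracy_by_category(
    rows: Iterable[Mapping],
    answer_fn: Callable[[Mapping], str],
) -> Dict[str, float]:
    """Accuracy per sampling category; 'unknown' predictions count as wrong."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for row in rows:
        pred = normalize_answer(answer_fn(row))
        total[row["category"]] += 1
        correct[row["category"]] += int(pred == row["answer"])
    return {category: correct[category] / total[category] for category in total}
```

For example, `accuracy_by_category(ds[split], my_model_answer)` would return one score per category, where `split` and `my_model_answer` are placeholders for the actual split name and your own inference function.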

	## Citation

	```bibtex
	@misc{neuhaus2025repope,
	title={RePOPE: Impact of Annotation Errors on the POPE Benchmark},
	author={Yannic Neuhaus and Matthias Hein},
	year={2025},
	eprint={2504.15707},
	archivePrefix={arXiv},
	primaryClass={cs.CV}
	}
	```