Structured Data Position Bias Benchmark

This benchmark tests position bias in structured formats (JSON, tables, logs), where formatting may either mitigate or exacerbate the "Lost in the Middle" effect.

Research Question

Does structured formatting (JSON, tables, logs) reduce position bias compared to unstructured prose? Or does the visual/structural regularity make middle-position items harder to find?

Experiments

#	Format	Target	Hypothesis
1	JSON Array	Key-value pair	Structured nesting may reduce bias
2	Markdown Table	Row value	Tabular structure provides visual anchors
3	Log File	Error code	Timestamp ordering may create temporal bias
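
To make the three setups concrete, here is a minimal sketch of how a context with a single target at a controlled position might be built for each format. It is not taken from this repository's code: the function name `build_context`, the field names `id` and `value`, the filler strings, and the log line layout are all assumptions chosen for illustration.

```python
import json
from datetime import datetime, timedelta


def build_context(fmt, num_items, target_pos, target_value="TARGET-42"):
    """Return a context string holding one target item at index target_pos.

    Hypothetical helper: names, fields, and layouts are illustrative only.
    """
    if not 0 <= target_pos < num_items:
        raise ValueError("target_pos must be in [0, num_items)")

    # Every item is filler except the one at target_pos.
    rows = [
        (i, target_value if i == target_pos else f"filler-{i}")
        for i in range(num_items)
    ]

    if fmt == "json":
        # Experiment 1: the target is a key-value pair inside a JSON array.
        return json.dumps([{"id": i, "value": v} for i, v in rows], indent=2)

    if fmt == "table":
        # Experiment 2: the target is a row value in a Markdown table.
        header = "| id | value |\n| --- | --- |"
        body = "\n".join(f"| {i} | {v} |" for i, v in rows)
        return header + "\n" + body

    if fmt == "log":
        # Experiment 3: the target is an error code; timestamps rise by 1 s per line.
        start = datetime(2026, 1, 1)
        lines = []
        for i, v in rows:
            level = "ERROR" if i == target_pos else "INFO"
            stamp = (start + timedelta(seconds=i)).isoformat()
            lines.append(f"{stamp} {level} code={v}")
        return "\n".join(lines)

    raise ValueError(f"unknown format: {fmt!r}")
```

Sweeping `target_pos` from 0 to `num_items - 1` while holding everything else fixed is what would isolate position as the variable under test.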

Usage

pip install -r requirements.txt
python run_all.py --model Qwen/Qwen2.5-1.5B-Instruct --num-items 100 --num-examples 50

Expected Finding

The expected result, stated as a hypothesis: "Position Bias Index (PBI) is significantly lower in tabular formats (PBI=0.18) than in JSON arrays (PBI=0.35) or prose (PBI=0.42), suggesting visual structure mitigates positional bias."
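
This README does not define the Position Bias Index, so the sketch below assumes one plausible definition purely for illustration: split target positions into thirds and take the mean accuracy of the first and last thirds minus the accuracy of the middle third. The function name `position_bias_index` and the `(target_pos, is_correct)` input shape are likewise assumptions, not the repository's actual implementation.

```python
def position_bias_index(results, num_items):
    """Edge-minus-middle accuracy gap (an ASSUMED definition of PBI).

    results: iterable of (target_pos, is_correct) pairs, one per example.
    num_items: number of items in each context.
    """
    hits = [0, 0, 0]
    totals = [0, 0, 0]
    for pos, correct in results:
        # Map the position to the first, middle, or last third of the context.
        bucket = min(3 * pos // num_items, 2)
        totals[bucket] += 1
        hits[bucket] += int(correct)

    if 0 in totals:
        raise ValueError("every third needs at least one example")

    first, middle, last = (h / t for h, t in zip(hits, totals))
    # Positive values mean the middle is retrieved less reliably than the edges.
    return (first + last) / 2 - middle


# Toy data: perfect at the start, weaker in the middle and at the end.
toy_results = [(0, True), (1, True), (50, False), (51, True), (98, True), (99, False)]
toy_pbi = position_bias_index(toy_results, num_items=100)
```

Under this assumed definition a value near zero would indicate no measurable middle-position penalty, and a negative value would mean the middle is retrieved more reliably than the edges.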

Citation

@software{structured_data_position_bias,
  title={Structured Data Position Bias: How Format Affects Long-Context Retrieval},
  author={abhshkp},
  year={2026},
  url={https://huggingface.co/abhshkp/structured-data-position-bias}
}

Generated by ML Intern

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.

Try ML Intern: https://smolagents-ml-intern.hf.space
Source code: https://github.com/huggingface/ml-intern
