---
tags:
- ml-intern
---

# Structured Data Position Bias Benchmark

Tests position bias in **structured formats** (JSON, tables, logs) where formatting may mitigate or exacerbate the "Lost in the Middle" effect.

## Research Question

> Does structured formatting (JSON, tables, logs) reduce position bias compared to unstructured prose? Or does the visual/structural regularity make middle-position items harder to find?

## Experiments

| # | Format | Target | Hypothesis |
|---|--------|--------|------------|
| 1 | **JSON Array** | Key-value pair | Structured nesting may reduce bias |
| 2 | **Markdown Table** | Row value | Tabular structure provides visual anchors |
| 3 | **Log File** | Error code | Timestamp ordering may create temporal bias |

## Usage

```bash
pip install -r requirements.txt
python run_all.py --model Qwen/Qwen2.5-1.5B-Instruct --num-items 100 --num-examples 50
```

## Expected Finding

> "Position Bias Index is significantly lower in tabular formats (PBI=0.18) than in JSON arrays (PBI=0.35) or prose (PBI=0.42), suggesting visual structure mitigates positional bias."

## Citation

```bibtex
@software{structured_data_position_bias,
  title={Structured Data Position Bias: How Format Affects Long-Context Retrieval},
  author={abhshkp},
  year={2026},
  url={https://huggingface.co/abhshkp/structured-data-position-bias}
}
```

## Generated by ML Intern

This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.

- Try ML Intern: https://smolagents-ml-intern.hf.space
- Source code: https://github.com/huggingface/ml-intern
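
## Illustrative Sketch

The three formats listed under Experiments can be illustrated with a small, self-contained sketch. It is not the code behind `run_all.py`; every function name here (`make_items`, `as_json_array`, `as_markdown_table`, `as_log_file`, `build_prompt`), the record fields (`id`, `value`), the log line layout, and the question wording are assumptions made for illustration. It shows one plausible way to render the same records in each format so that a single target record can be placed at a chosen position and queried.

```python
import json
import random
from datetime import datetime, timedelta


def make_items(num_items, target_index, seed=0):
    """Create num_items records and return them with the one at target_index."""
    rng = random.Random(seed)  # seeded so every format sees identical data
    items = [
        {"id": f"item-{i:04d}", "value": rng.randint(1000, 9999)}
        for i in range(num_items)
    ]
    return items, items[target_index]


def as_json_array(items):
    """Render the records as a pretty-printed JSON array."""
    return json.dumps(items, indent=2)


def as_markdown_table(items):
    """Render the records as a Markdown table with a header and separator row."""
    rows = ["| id | value |", "|----|-------|"]
    rows += [f"| {item['id']} | {item['value']} |" for item in items]
    return "\n".join(rows)


def as_log_file(items):
    """Render the records as log lines, one second apart (hypothetical layout)."""
    start = datetime(2026, 1, 1)
    lines = []
    for i, item in enumerate(items):
        stamp = (start + timedelta(seconds=i)).strftime("%Y-%m-%dT%H:%M:%SZ")
        lines.append(f"{stamp} ERROR id={item['id']} code={item['value']}")
    return "\n".join(lines)


FORMATTERS = {
    "json": as_json_array,
    "table": as_markdown_table,
    "log": as_log_file,
}


def build_prompt(fmt, num_items, target_index, seed=0):
    """Return (prompt, expected_answer) for one format and target position."""
    items, target = make_items(num_items, target_index, seed)
    context = FORMATTERS[fmt](items)
    question = f"What value is associated with id {target['id']}?"
    return f"{context}\n\n{question}", str(target["value"])


if __name__ == "__main__":
    # Place the target at the start, middle, and end of a 100-item context.
    for position in (0, 50, 99):
        prompt, answer = build_prompt("table", 100, position)
        print(position, len(prompt), answer)
```

Because the records are generated from a fixed seed, the same target value appears at the same position in all three formats, so any accuracy difference between formats would come from the formatting rather than the data. Scoring model outputs against `expected_answer` per position is left out here, since the exact definition of the Position Bias Index is not given in this README.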