
Structured Data Position Bias Benchmark

This benchmark tests position bias in structured formats (JSON, tables, logs), where formatting may either mitigate or exacerbate the "Lost in the Middle" effect.

Research Question

Does structured formatting (JSON, tables, logs) reduce position bias compared to unstructured prose? Or does the visual/structural regularity make middle-position items harder to find?

Experiments

| # | Format | Target | Hypothesis |
|---|--------|--------|------------|
| 1 | JSON Array | Key-value pair | Structured nesting may reduce bias |
| 2 | Markdown Table | Row value | Tabular structure provides visual anchors |
| 3 | Log File | Error code | Timestamp ordering may create temporal bias |
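The three formats above can be illustrated with a minimal sketch that places one target item at a chosen position and renders the same records as a JSON array, a Markdown table, and a log file. This is not the repository's code: the function names, the record fields, the `TARGET_VALUE` marker, and the timestamp layout are all hypothetical, chosen only to show how the target position can be varied while the format changes.

```python
import json


def make_items(num_items, target_index):
    """Build distractor records with a single target record at target_index.

    The field names and the TARGET_VALUE marker are illustrative assumptions.
    """
    items = [
        {"id": i, "key": f"key_{i}", "value": f"value_{i}"}
        for i in range(num_items)
    ]
    items[target_index]["value"] = "TARGET_VALUE"
    return items


def render_json(items):
    """Experiment 1: the target is a key-value pair inside a JSON array."""
    return json.dumps(items, indent=2)


def render_table(items):
    """Experiment 2: the target is a value in one row of a Markdown table."""
    header = "| id | key | value |\n|---|---|---|"
    rows = [f"| {it['id']} | {it['key']} | {it['value']} |" for it in items]
    return "\n".join([header] + rows)


def render_log(items):
    """Experiment 3: the target is an error code; timestamps grow with position.

    The date and line layout are made up for this sketch.
    """
    return "\n".join(
        f"2026-01-01T00:{it['id'] // 60:02d}:{it['id'] % 60:02d}Z "
        f"ERROR code={it['value']} source={it['key']}"
        for it in items
    )


if __name__ == "__main__":
    # Ten items with the target in the middle of the context.
    items = make_items(num_items=10, target_index=5)
    for render in (render_json, render_table, render_log):
        print(render(items))
        print()
```

Sweeping `target_index` from 0 to `num_items - 1` for each renderer gives one prompt per position and format, which is the comparison the hypotheses in the table are about.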

Usage

pip install -r requirements.txt
python run_all.py --model Qwen/Qwen2.5-1.5B-Instruct --num-items 100 --num-examples 50

Expected Finding

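The anticipated headline result, stated as a hypothesis to be checked against the benchmark output:

"Position Bias Index is significantly lower in tabular formats (PBI=0.18) than in JSON arrays (PBI=0.35) or prose (PBI=0.42), suggesting visual structure mitigates positional bias."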
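This README does not define how the Position Bias Index is computed. As a rough sketch only, one plausible definition is the spread between the best and worst retrieval accuracy across position buckets, so that 0 means no positional effect and larger values mean stronger bias. The definition, the bucketing scheme, and both function names below are assumptions made for illustration, not the benchmark's actual metric.

```python
from collections import defaultdict


def accuracy_by_position(results, num_buckets=5):
    """Group results into position buckets and return accuracy per bucket.

    results: iterable of (relative_position, correct) pairs, where
    relative_position is in [0, 1] and correct is a bool.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rel_pos, correct in results:
        # Clamp so that relative_position == 1.0 falls in the last bucket.
        bucket = min(int(rel_pos * num_buckets), num_buckets - 1)
        totals[bucket] += 1
        hits[bucket] += int(correct)
    return {b: hits[b] / totals[b] for b in sorted(totals)}


def position_bias_index(acc):
    """Assumed definition: best bucket accuracy minus worst bucket accuracy."""
    values = list(acc.values())
    return max(values) - min(values)


if __name__ == "__main__":
    # Toy results: perfect at both ends, 50% in the middle.
    toy = [
        (0.0, True), (0.0, True),
        (0.5, True), (0.5, False),
        (1.0, True), (1.0, True),
    ]
    acc = accuracy_by_position(toy, num_buckets=3)
    print(acc, position_bias_index(acc))
```

Under this assumed definition, a "Lost in the Middle" pattern shows up as high accuracy in the first and last buckets and a dip in between. Other definitions, such as normalizing by mean accuracy, would give different numbers for the same results.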

Citation

@software{structured_data_position_bias,
  title={Structured Data Position Bias: How Format Affects Long-Context Retrieval},
  author={abhshkp},
  year={2026},
  url={https://huggingface.co/abhshkp/structured-data-position-bias}
}

Generated by ML Intern

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.