Position Bias Taxonomy: Cross-Task Framework

A unified framework that evaluates and classifies position bias across different cognitive task types, revealing that position bias is not a single scalar but a task-dependent phenomenon.

Research Question

Does a model's position bias on retrieval tasks predict its position bias on reasoning, summarization, or translation tasks? Or are these independent dimensions of model behavior?
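
One way to make this question concrete is to compute a bias score per model on two tasks (for example the Position Bias Index defined later in this document) and correlate the two lists across models. The sketch below is only an illustration under that assumption: `pearson_r` is a hypothetical helper, not part of this repository, and the numbers are made-up placeholders rather than measured results.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / sqrt(var_x * var_y)


# Placeholder bias scores, one entry per model (illustrative only, NOT real results).
retrieval_bias = [0.42, 0.31, 0.55, 0.18]
reasoning_bias = [0.10, 0.25, 0.12, 0.20]

r = pearson_r(retrieval_bias, reasoning_bias)
print(f"retrieval vs. reasoning: r = {r:.2f}")
```

A value of r near zero would support the "independent dimensions" reading; a value near 1 would suggest a single underlying bias. With only a handful of models the estimate is noisy, so it should be read as a rough indication.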

Position Bias Taxonomy

Type        Pattern             Description
Primacy     High → Low          Best at start
Recency     Low → High          Best at end
U-Shaped    High → Low → High   Classic LITM
Middle-Sag  Flat → Low → Flat   Only middle suffers
Flat        ~Constant           No position effect
Inverted-U  Low → High → Low    Middle best (rare)
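
The taxonomy can be sketched as a small rule-based classifier over an accuracy-by-position curve. This is a minimal illustration, not the repository's implementation: the function name `classify_curve`, the tolerance `tol`, and the use of quarter-position "shoulder" points to separate U-Shaped from Middle-Sag are all assumptions made for the example.

```python
def classify_curve(accs, tol=0.05):
    """Label an accuracy curve sampled at evenly spaced positions (start to end).

    tol is a hypothetical threshold: differences of at most tol count as "flat".
    """
    if len(accs) < 5:
        raise ValueError("need at least 5 positions to separate U-Shaped from Middle-Sag")

    start, end = accs[0], accs[-1]
    middle = accs[len(accs) // 2]
    q = len(accs) // 4  # index of the "shoulder" points near the edges
    left_shoulder, right_shoulder = accs[q], accs[-1 - q]

    if max(accs) - min(accs) <= tol:
        return "Flat"

    if min(start, end) - middle > tol:
        # The middle is worse than both edges. If accuracy is still at edge
        # level at the shoulders, only the middle suffers (Middle-Sag);
        # otherwise accuracy already declines toward the middle (U-Shaped).
        edges_flat = (abs(left_shoulder - start) <= tol
                      and abs(right_shoulder - end) <= tol)
        return "Middle-Sag" if edges_flat else "U-Shaped"

    if middle - max(start, end) > tol:
        return "Inverted-U"
    if start - end > tol:
        return "Primacy"
    if end - start > tol:
        return "Recency"
    return "Flat"


print(classify_curve([0.9, 0.6, 0.3, 0.6, 0.9]))
print(classify_curve([0.9, 0.9, 0.4, 0.9, 0.9]))
```

The order of the checks matters: the middle-versus-edges tests run before the start-versus-end tests, so a curve with a clear dip is not mislabeled as Primacy or Recency just because its two edges differ slightly.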

Position Bias Index (PBI)

PBI = (acc_start + acc_end) / 2 - acc_middle
  • PBI > 0.3: Strong U-shape
  • PBI ≈ 0: Flat (good)
  • PBI < 0: Inverted-U (rare)
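
As a worked example, the formula and thresholds above translate directly into a few lines of Python. The function names are hypothetical. The tolerance used for "≈ 0" and the label for values between that tolerance and 0.3 are assumptions, since the list above does not name that band.

```python
def position_bias_index(acc_start, acc_middle, acc_end):
    """PBI = (acc_start + acc_end) / 2 - acc_middle."""
    return (acc_start + acc_end) / 2 - acc_middle


def interpret_pbi(pbi, flat_tol=0.05):
    """Map a PBI value to the interpretation bands listed above.

    flat_tol is an assumed tolerance for "PBI ≈ 0".
    """
    if pbi > 0.3:
        return "Strong U-shape"
    if abs(pbi) <= flat_tol:
        return "Flat"
    if pbi < 0:
        return "Inverted-U"
    # Assumption: the source does not label the band between flat and 0.3.
    return "Between flat and strong U-shape"


# Edges at 0.9 and 0.8, middle at 0.4: edge mean 0.85, so PBI is about 0.45.
pbi = position_bias_index(acc_start=0.9, acc_middle=0.4, acc_end=0.8)
print(round(pbi, 2), interpret_pbi(pbi))
```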

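The weighted PBI accounts for the full shape of the accuracy curve rather than three points: it integrates the curve with Simpson's rule and takes the difference in area under the curve (AUC) between the edge regions and the middle region.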
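
The exact region boundaries and normalization are not specified here, so the following is only one possible reading, sketched under stated assumptions: positions are evenly spaced on [0, 1], the edge regions are the first and last thirds, the middle region is the central third, and each area is divided by its width so the result is on the same scale as PBI. The names `simpson` and `weighted_pbi` are hypothetical.

```python
def simpson(ys, h):
    """Composite Simpson's rule for an odd number of evenly spaced samples."""
    if len(ys) < 3 or len(ys) % 2 == 0:
        raise ValueError("Simpson's rule needs an odd number of samples (>= 3)")
    odd_terms = sum(ys[1:-1:2])   # samples at odd indices, weight 4
    even_terms = sum(ys[2:-1:2])  # interior samples at even indices, weight 2
    return h / 3 * (ys[0] + ys[-1] + 4 * odd_terms + 2 * even_terms)


def weighted_pbi(accs):
    """Mean accuracy over the edge thirds minus mean accuracy over the middle third.

    Assumption: the curve is split into thirds; this split is not taken from the source.
    """
    intervals = len(accs) - 1
    if intervals < 6 or intervals % 6 != 0:
        # Each third must contain an even number of intervals for Simpson's rule.
        raise ValueError("number of positions must be 6k + 1 (7, 13, 19, ...)")
    third = intervals // 3
    h = 1.0 / intervals   # spacing between positions on [0, 1]
    width = third * h     # width of each region (one third)

    left = simpson(accs[: third + 1], h)
    middle = simpson(accs[third : 2 * third + 1], h)
    right = simpson(accs[2 * third :], h)

    edge_mean = (left + right) / (2 * width)
    middle_mean = middle / width
    return edge_mean - middle_mean


curve = [0.9, 0.8, 0.6, 0.4, 0.6, 0.8, 0.9]  # illustrative values, not measured
print(weighted_pbi(curve))
```

Like the plain PBI, this quantity is positive when the edges outperform the middle, near zero for a flat curve, and negative for an inverted-U.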

5 Task Types

Task                      Cognitive Demand             Expected Bias
KV Retrieval              Simple lookup                Strong U-shape
Needle in Haystack        Text search                  Strong U-shape
Fact-Dependent Reasoning  Reasoning + retrieval        Moderate U-shape
Summarization             Comprehension + compression  Weak/moderate
Translation               Understanding + generation   Task-dependent

Usage

pip install -r requirements.txt

# Single model, all tasks
python run_all.py --models Qwen/Qwen2.5-1.5B-Instruct

# Multiple models
python run_all.py --models Qwen/Qwen2.5-1.5B-Instruct meta-llama/Llama-3.2-1B-Instruct

# Specific tasks only
python run_all.py --tasks kv_retrieval reasoning --num-examples 50

Expected Headline Result

"PBI on retrieval tasks correlates poorly with PBI on reasoning tasks (r=0.12), suggesting position bias is not a unified model property but a task-dependent phenomenon."

Output

results/
├── taxonomy_Qwen_Qwen2.5-1.5B-Instruct.json
├── taxonomy_meta-llama_Llama-3.2-1B-Instruct.json
└── ...

Each JSON contains task-level accuracies, PBI, classification, and cross-task statistics.

Citation

@software{position_bias_taxonomy,
  title={Position Bias Taxonomy: A Cross-Task Framework for Long-Context Evaluation},
  author={abhshkp},
  year={2026},
  url={https://huggingface.co/abhshkp/position-bias-taxonomy}
}

Generated by ML Intern

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.
