Injecting Domain Adaptation with Learning-to-hash for Effective and Efficient Zero-shot Dense Retrieval Paper • 2205.11498 • Published May 23, 2022
ORBIT: Scalable and Verifiable Data Generation for Search Agents on a Tight Budget Paper • 2604.01195 • Published 13 days ago
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent Paper • 2508.06600 • Published Aug 8, 2025
FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents Paper • 2504.13128 • Published Apr 17, 2025
Chatbot Arena Meets Nuggets: Towards Explanations and Diagnostics in the Evaluation of LLM Responses Paper • 2504.20006 • Published Apr 28, 2025
Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval Paper • 2505.16967 • Published May 22, 2025