GlobeSumm: A Challenging Benchmark Towards Unifying Multi-lingual, Cross-lingual and Multi-document News Summarization Paper • 2410.04087 • Published Oct 5, 2024
Causal Tracing of Object Representations in Large Vision Language Models: Mechanistic Interpretability and Hallucination Mitigation Paper • 2511.05923 • Published Nov 8, 2025
Fine-Mem: Fine-Grained Feedback Alignment for Long-Horizon Memory Management Paper • 2601.08435 • Published Jan 13
ImplicitMemBench: Measuring Unconscious Behavioral Adaptation in Large Language Models Paper • 2604.08064 • Published 6 days ago
The Role of Summarization in Generative Agents: A Preliminary Perspective Paper • 2305.01253 • Published May 2, 2023
Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models Paper • 2403.00231 • Published Mar 1, 2024
Hierarchical Catalogue Generation for Literature Review: A Benchmark Paper • 2304.03512 • Published Apr 7, 2023
Length Extrapolation of Transformers: A Survey from the Perspective of Positional Encoding Paper • 2312.17044 • Published Dec 28, 2023
Learning Fine-Grained Grounded Citations for Attributed Large Language Models Paper • 2408.04568 • Published Aug 8, 2024
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs Paper • 2502.12982 • Published Feb 18, 2025
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows Paper • 2505.19897 • Published May 26, 2025