Appear2Meaning: A Cross-Cultural Benchmark for Structured Cultural Metadata Inference from Images Paper • 2604.07338 • Published 11 days ago • 5
Appear2Meaning: A Cross-Cultural Benchmark for Structured Cultural Metadata Inference from Images Paper • 2604.07338 • Published 11 days ago • 5
MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation Paper • 2506.14028 • Published Jun 16, 2025 • 94
All That Glisters Is Not Gold: A Benchmark for Reference-Free Counterfactual Financial Misinformation Detection Paper • 2601.04160 • Published Jan 7 • 4
Same Claim, Different Judgment: Benchmarking Scenario-Induced Bias in Multilingual Financial Misinformation Detection Paper • 2601.05403 • Published Jan 8 • 11
The FinBen: An Holistic Financial Benchmark for Large Language Models Paper • 2402.12659 • Published Feb 20, 2024 • 24
FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design Paper • 2311.13743 • Published Nov 23, 2023 • 2
All That Glisters Is Not Gold: A Benchmark for Reference-Free Counterfactual Financial Misinformation Detection Paper • 2601.04160 • Published Jan 7 • 4
Same Claim, Different Judgment: Benchmarking Scenario-Induced Bias in Multilingual Financial Misinformation Detection Paper • 2601.05403 • Published Jan 8 • 11
Same Claim, Different Judgment: Benchmarking Scenario-Induced Bias in Multilingual Financial Misinformation Detection Paper • 2601.05403 • Published Jan 8 • 11
INVESTORBENCH: A Benchmark for Financial Decision-Making Tasks with LLM-based Agent Paper • 2412.18174 • Published Dec 24, 2024 • 2
ConspEmoLLM: Conspiracy Theory Detection Using an Emotion-Based Large Language Model Paper • 2403.06765 • Published Mar 11, 2024
EmoLLMs: A Series of Emotional Large Language Models and Annotation Tools for Comprehensive Affective Analysis Paper • 2401.08508 • Published Jan 16, 2024 • 1
The FinBen: An Holistic Financial Benchmark for Large Language Models Paper • 2402.12659 • Published Feb 20, 2024 • 24