CMU-LTI

university

Recent Activity

VanishD authored a paper 11 days ago

T-Eval: Evaluating the Tool Utilization Capability Step by Step

VanishD authored a paper 11 days ago

Building Cooperative Embodied Agents Modularly with Large Language Models

VanishD authored a paper 11 days ago

HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments

Papers

Benchmark Test-Time Scaling of General LLM Agents

On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models

VanishD authored 7 papers 11 days ago

T-Eval: Evaluating the Tool Utilization Capability Step by Step

Paper • 2312.14033 • Published Dec 21, 2023 • 2

Building Cooperative Embodied Agents Modularly with Large Language Models

Paper • 2307.02485 • Published Jul 5, 2023 • 12

HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments

Paper • 2401.12975 • Published Jan 23, 2024

Agentic-R1: Distilled Dual-Strategy Reasoning

Paper • 2507.05707 • Published Jul 8, 2025

Scaling LLM Multi-turn RL with End-to-end Summarization-based Context Management

Paper • 2510.06727 • Published Oct 8, 2025 • 5

Training Proactive and Personalized LLM Agents

Paper • 2511.02208 • Published Nov 4, 2025

Mind the Sim2Real Gap in User Simulation for Agentic Tasks

Paper • 2603.11245 • Published Mar 11

updated a dataset 12 days ago

cmu-lti/tau-usi

Updated 12 days ago • 12

published a dataset 12 days ago

cmu-lti/tau-usi

Updated 12 days ago • 12

updated a dataset about 2 months ago

cmu-lti/machine-translation-for-vision

Viewer • Updated Mar 3 • 696 • 1.09k • 1

lixiaochuan2020 submitted a paper to Daily Papers about 2 months ago

Benchmark Test-Time Scaling of General LLM Agents

Paper • 2602.18998 • Published Feb 22 • 9

authored a paper about 2 months ago

Towards Reducible Uncertainty Modeling for Reliable Large Language Model Agents

Paper • 2602.05073 • Published Feb 4 • 11

published a Space 2 months ago

MachineTranslationforVision

Explore competition details and submit entries

updated a Space 3 months ago

MachineTranslationforVision

Explore competition details and submit entries

submitted a paper to Daily Papers 3 months ago

CooperBench: Why Coding Agents Cannot be Your Teammates Yet

Paper • 2601.13295 • Published Jan 19 • 5

authored 2 papers 3 months ago

PRiSM: Benchmarking Phone Realization in Speech Models

Paper • 2601.14046 • Published Jan 20 • 7

Towards Comprehensive Semantic Speech Embeddings for Chinese Dialects

Paper • 2601.07274 • Published Jan 12 • 1

authored a paper 3 months ago

Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations

Paper • 2305.12715 • Published May 22, 2023

authored 2 papers 4 months ago

Measuring Sycophancy of Language Models in Multi-turn Dialogues

Paper • 2505.23840 • Published May 28, 2025 • 3

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published Jul 1, 2025 • 79