ConsistentChat: Building Skeleton-Guided Consistent Dialogues for Large Language Models from Scratch Paper • 2506.03558 • Published Jun 4, 2025 • 5
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding Paper • 2505.22618 • Published May 28, 2025 • 45
ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers Paper • 2504.00502 • Published Apr 1, 2025 • 26
ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers Paper • 2504.00502 • Published Apr 1, 2025 • 26
ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers Paper • 2504.00502 • Published Apr 1, 2025 • 26 • 2
SAISA: Towards Multimodal Large Language Models with Both Training and Inference Efficiency Paper • 2502.02458 • Published Feb 4, 2025 • 1
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect Paper • 2403.03853 • Published Mar 6, 2024 • 66