LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory Paper • 2410.10813 • Published Oct 14, 2024 • 16
Article Prefill and Decode for Concurrent Requests - Optimizing LLM Performance Apr 16, 2025 • 69
arcee-ai/Trinity-Large-Thinking Text Generation • 399B • Updated 5 days ago • 15.8k • 153
Article KV Caching Explained: Optimizing Transformer Inference Efficiency Jan 30, 2025 • 293
huggingface-course/supervised-finetuning_quiz_student_responses Viewer • Updated about 1 hour ago • 10 • 576 • 3
DavidAU/Qwen3.5-40B-Claude-4.5-Opus-High-Reasoning-Thinking Image-Text-to-Text • 40B • Updated 28 days ago • 726 • 40
Qwen/Qwen3.5-397B-A17B Image-Text-to-Text • 403B • Updated about 1 month ago • 782k • 1.44k