- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 24
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 153
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25
Collections including paper arxiv:2602.10604
- Towards Scalable Pre-training of Visual Tokenizers for Generation
  Paper • 2512.13687 • Published • 106
- MMGR: Multi-Modal Generative Reasoning
  Paper • 2512.14691 • Published • 121
- Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
  Paper • 2512.23447 • Published • 99
- LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation
  Paper • 2512.23576 • Published • 66
- Step-3.5-Flash Chatbot
  🚀 45 • Run interactive Streamlit apps directly in your browser
- stepfun-ai/Step-3.5-Flash
  Text Generation • 199B • Updated • 143k • • 782
- stepfun-ai/Step-3.5-Flash-FP8
  Text Generation • 199B • Updated • 325k • 51
- stepfun-ai/Step-3.5-Flash-GGUF-Q4_K_S
  Text Generation • 197B • Updated • 8.86k • 142