- GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
  Paper • 2508.06471 • Published • 211
- NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
  Paper • 2508.14444 • Published • 47
- Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
  Paper • 2507.06261 • Published • 67
- MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
  Paper • 2506.13585 • Published • 274
Collections
Collections including paper arxiv:2505.00949
- nvidia/Llama-3_3-Nemotron-Super-49B-v1_5
  Text Generation • 50B • Updated • 100k • 231
- nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8
  Text Generation • 50B • Updated • 42k • 27
- nvidia/Llama-3_1-Nemotron-Ultra-253B-v1
  Text Generation • Updated • 7k • 345
- nvidia/Llama-3_3-Nemotron-Super-49B-v1
  Text Generation • 50B • Updated • 32k • 322
- gretelai/synthetic_text_to_sql
  Viewer • Updated • 106k • 2.1k • 642
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 251
- Wan-AI/Wan2.2-T2V-A14B-Diffusers
  Text-to-Video • Updated • 78k • 128
- Llama-Nemotron: Efficient Reasoning Models
  Paper • 2505.00949 • Published • 43
- Human-like Episodic Memory for Infinite Context LLMs
  Paper • 2407.09450 • Published • 62
- MUSCLE: A Model Update Strategy for Compatible LLM Evolution
  Paper • 2407.09435 • Published • 23
- Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
  Paper • 2407.09121 • Published • 6
- ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
  Paper • 2407.14482 • Published • 26
- RL + Transformer = A General-Purpose Problem Solver
  Paper • 2501.14176 • Published • 28
- Towards General-Purpose Model-Free Reinforcement Learning
  Paper • 2501.16142 • Published • 31
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
  Paper • 2501.17161 • Published • 125
- MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
  Paper • 2412.12098 • Published • 4