Active filters: 8-bit
HF1BitLLM/Llama3-8B-1.58-100B-tokens • Text Generation • 3B • Updated • 2.34k • 211
MaziyarPanahi/Llama-3.2-3B-Instruct-GGUF • Text Generation • 3B • Updated • 85.9k • 15
lmstudio-community/Qwen2.5-Coder-14B-Instruct-MLX-8bit • Text Generation • 4B • Updated • 106k • 2
mlx-community/DeepSeek-R1-Distill-Llama-8B-8bit • 2B • Updated • 131 • 4
Text Generation • 397B • Updated • 4.41k • 277
MaziyarPanahi/Phi-4-mini-instruct-GGUF • Text Generation • 4B • Updated • 92k • 12
nvidia/Llama-4-Scout-17B-16E-Instruct-NVFP4 • 56B • Updated • 46.3k • 30
nvidia/DeepSeek-V3-0324-NVFP4 • Text Generation • 397B • Updated • 38.5k • 17
lmstudio-community/DeepSeek-R1-0528-Qwen3-8B-MLX-8bit • Text Generation • 2B • Updated • 323k • 16
nvidia/DeepSeek-R1-0528-NVFP4 • Text Generation • 397B • Updated • 6.6k • 44
mlx-community/DiffuCoder-7B-cpGRPO-8bit • Text Generation • 8B • Updated • 93 • 9
nvidia/Qwen3-235B-A22B-NVFP4 • Text Generation • 133B • Updated • 9.68k • 16
mlx-community/SmolLM3-3B-8bit • Text Generation • Updated • 95 • 9
nvidia/DeepSeek-R1-NVFP4-v2 • Text Generation • 394B • Updated • 6.07k • 7
mlx-community/GLM-4.5-Air-8bit • Text Generation • 107B • Updated • 2.31k • 9
lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-8bit • Text Generation • 31B • Updated • 200k • 15
ramblingpolymath/Qwen3-Coder-30B-A3B-Instruct-W8A8 • Text Generation • 31B • Updated • 437 • 3
Text Generation • 120B • Updated • 54.2k • 21
huizimao/gpt-oss-20b-helpful-MXFP4-QAT • 21B • Updated • 2
nvidia/Phi-4-reasoning-plus-NVFP4 • 8B • Updated • 1.38k • 9
nvidia/Llama-3.1-8B-Instruct-NVFP4 • 5B • Updated • 118k • 9
Text Generation • 5B • Updated • 29.1k • 17
Text Generation • 8B • Updated • 31.1k • 8
Text Generation • 17B • Updated • 127k • 15
nvidia/Qwen2.5-VL-7B-Instruct-NVFP4 • Text Generation • 5B • Updated • 24.1k • 15
xxrjun/gpt-oss-120b-mxfp4 • 120B • Updated • 16 • 1
Text Generation • 5B • Updated • 2.37k • 2
nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-NVFP4 • Text Generation • 26B • Updated • 13.9k • 17
MaziyarPanahi/VibeThinker-1.5B-GGUF • Text Generation • 2B • Updated • 452 • 36
lmstudio-community/Olmo-3-7B-Instruct-MLX-8bit • 7B • Updated • 332 • 1