Active filters: cuda
Text Generation
• 8B • Updated • 89.7k
• 636
prism-ml/Bonsai-1.7B-gguf
Text Generation
• 2B • Updated • 16.8k
• 59
Text Generation
• 4B • Updated • 11.9k
• 37
ussoewwin/Flash-Attention-2_for_Windows
Multilingual-Multimodal-NLP/IndustrialCoder
Text Generation
• 32B • Updated • 1.53k
• 57
ValiantLabs/gemma-4-E4B-it-ShiningValiant3
Image-Text-to-Text
• 8B • Updated • 9
• 3
ValiantLabs/gemma-4-E2B-it-ShiningValiant3
Image-Text-to-Text
• 5B • Updated • 27
• 2
aydin99/FLUX.2-klein-4B-int8
Text-to-Image
• Updated • 206
• 11
Multilingual-Multimodal-NLP/IndustrialCoder-Base
Text Generation
• 32B • Updated • 108
• 3
Multilingual-Multimodal-NLP/IndustrialCoder-Thinking
Text Generation
• 32B • Updated • 121
• 4
Text Generation
• Updated • 11
• 23
CalderaAI/13B-Ouroboros-GPTQ4bit-128g-CUDA
Text Generation
• Updated • 10
marcorez8/llama-cpp-python-windows-blackwell-cuda
ValiantLabs/Qwen3-8B-ShiningValiant3
Text Generation
• 8B • Updated • 15
• 3
mradermacher/Qwen3-8B-ShiningValiant3-GGUF
8B • Updated • 1.55k
• 2
mradermacher/Qwen3-8B-ShiningValiant3-i1-GGUF
8B • Updated • 540
• 2
ValiantLabs/Qwen3-1.7B-ShiningValiant3
Text Generation
• 2B • Updated • 34
• 5
mradermacher/Qwen3-1.7B-ShiningValiant3-GGUF
2B • Updated • 180
mradermacher/Qwen3-1.7B-ShiningValiant3-i1-GGUF
2B • Updated • 374
ValiantLabs/Qwen3-4B-ShiningValiant3
Text Generation
• 4B • Updated • 40
• 7
sequelbox/Qwen3-8B-PlumEsper
Text Generation
• 8B • Updated • 2
sequelbox/Qwen3-4B-PlumEsper
Text Generation
• 4B • Updated • 3
mradermacher/Qwen3-Shining-Lucy-CODER-3.5B-Brainstorm20x-e32-GGUF
3B • Updated • 202
• 2
mradermacher/Qwen3-Shining-Lucy-CODER-2.4B-mix2-GGUF
2B • Updated • 160
mradermacher/Qwen3-Shining-Lucy-CODER-2.4B-GGUF
2B • Updated • 109
mradermacher/Qwen3-Shining-Lucy-CODER-2.4B-mix2-i1-GGUF
2B • Updated • 99
• 1
mradermacher/Qwen3-Shining-Lucy-CODER-2.4B-i1-GGUF
2B • Updated • 597
mradermacher/Qwen3-Shining-Lucy-CODER-3.5B-Brainstorm20x-e32-i1-GGUF
3B • Updated • 536
• 1