Active filters: quark
EliovpAI/Qwen3-14B-FP8-KV
Text Generation • 15B • Updated • 4 • 2
matmelis/Llama_3.2_1B_w_uint4_autosmoothquant_gptq
0.4B • Updated • 3
matmelis/Llama_3.2_1B_w_uint4_qronos
0.4B • Updated • 12
matmelis/Llama_3.2_3B_w_mxfp4_a_mxfp4_qronos
2B • Updated • 2
EliovpAI/Qwen3-0.6B-FP8-KV
Text Generation • 0.6B • Updated • 2
matmelis/Llama_3.2_3B_w_mxfp4_a_mxfp4_gptq
2B • Updated • 5
matmelis/Llama_3.2_1B_w_int3_qronos
0.6B • Updated • 1
matmelis/Llama_3.2_3B_w_uint4_qronos
0.8B • Updated • 1
matmelis/Llama_3.2_3B_w_int3_qronos
1B • Updated • 5
matmelis/Llama_3.2_3B_w_int2_gptq
3B • Updated • 1
matmelis/Llama_3.2_3B_w_int3_gptq
3B • Updated • 2
matmelis/Llama_3.2_3B_w_int2_qronos
3B • Updated • 4
matmelis/Llama_3.2_1B_w_int3_gptq
0.6B • Updated • 2
matmelis/Llama_3.2_3B_w_uint4_gptq
0.8B • Updated • 1
matmelis/Llama_3.2_1B_w_int2_qronos
1B • Updated • 2
matmelis/Llama_3.2_1B_w_int2_gptq
1B • Updated • 9
matmelis/Llama_3.2_1B_w_uint4_smoothquant_qronos
0.4B • Updated • 1
amd/DeepSeek-R1-0528-MXFP4-ASQ
342B • Updated • 1.63k • 1
haoyang-amd/output_oss_120b_moe_w_mxfp4_a_mxfp4
174B • Updated • 1
haoyang-amd/output_oss_20b_moe_w_mxfp4_a_bfloat16
11B • Updated • 1
matmelis/Llama_3.2_1B_w_mxfp4_a_mxfp4_gptq
0.8B • Updated • 2
matmelis/Llama_3.2_1B_w_mxfp4_a_mxfp4_qronos
0.8B • Updated • 2
amd/Mixtral-8x7B-Instruct-v0.1-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8
37B • Updated • 9
amd/Llama-2-70b-chat-hf-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8
55B • Updated • 2
Keozon/GLM-4.5-Air-fp8_e4m3-quark-gfx1100
107B • Updated • 6 • 1
amd/Qwen3-8B-WMXFP4FP8-AMXFP4FP8-AMP-KVFP8
6B • Updated • 3.81k • 2
amd/Qwen2.5-1.5B-Instruct-ptpc-Quark-ts
2B • Updated • 8.3k
EmbeddedLLM/Qwen3-Coder-480B-A35B-Instruct-FP8-Dynamic
480B • Updated • 2
amd/gpt-oss-120b-w-mxfp4-a-fp8
Updated • 3.35k • 7
Quark-NPU-Workshop/Phi-3-mini-4k-instruct
0.6B • Updated • 2