Qwen3-Coder-30B-A3B-Instruct-Q4_K_M-llamafile

This model was quantized and converted to GGUF format from Qwen/Qwen3-Coder-30B-A3B-Instruct using llama.cpp via ggml.ai's GGUF-my-repo space. The resulting GGUF was then packaged into a llamafile using llamafile v0.9.3 and the .args file present in this repository.

Refer to the original model card for more details on the model.

Refer to GGUF-my-repo for more details on the quantization (Q4_K_M) and conversion to GGUF.

Refer to llamafile for more details on the accessible llamafile format, the sandboxing and performant server it provides, and how to use it.

Quickstart

Download the llamafile and make it executable

wget https://huggingface.co/chrisolof/Qwen3-Coder-30B-A3B-Instruct-Q4_K_M-llamafile/resolve/main/qwen3-coder-30b-a3b-instruct-q4_k_m.llamafile
chmod +x qwen3-coder-30b-a3b-instruct-q4_k_m.llamafile

Run the command-line chat interface

./qwen3-coder-30b-a3b-instruct-q4_k_m.llamafile

Run server

./qwen3-coder-30b-a3b-instruct-q4_k_m.llamafile --server --v2 --listen 127.0.0.1:8011

List the available v2 (LLaMAfiler) server flags via ./qwen3-coder-30b-a3b-instruct-q4_k_m.llamafile --server --v2 --help, and see the LLaMAfiler documentation for further details.
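Once the server is running, it can be queried over HTTP. Below is a minimal Python sketch, assuming the v2 server exposes an OpenAI-compatible /v1/chat/completions route at the --listen address used above; the endpoint path and the model name in the payload are assumptions, so check the LLaMAfiler documentation for the routes and fields your version actually serves.

```python
import json
import urllib.request

# Assumed endpoint: the --v2 server from the command above, reachable
# at the address passed to --listen. Adjust if your setup differs.
URL = "http://127.0.0.1:8011/v1/chat/completions"


def build_payload(prompt: str) -> dict:
    """Build an OpenAI-style chat completion request body.

    The "model" value here is a guess based on the llamafile's name;
    some servers ignore this field entirely.
    """
    return {
        "model": "qwen3-coder-30b-a3b-instruct-q4_k_m",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }


def chat(prompt: str) -> str:
    """POST a chat request to the local server and return the reply text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Requires the llamafile server to be running first.
    print(chat("Write a Python function that reverses a string."))
```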
